Python editing .txt file

187braintrust at berkeley.edu
Tue Jun 15 21:28:44 EDT 2010


I am trying to write a program in Python that will edit .txt log files that
contain regression output from R.  Any thoughts or suggestions would be
greatly appreciated.

To give an idea of what I am trying to do: I include fixed effects in the R
regressions, which adds hundreds of extra lines per regression that I am not
interested in right now.  Basically, I want to save a shortened version of
each .txt file in which every block of fixed-effects coefficients is replaced
by a single line saying that the regression includes fixed effects for the
variable in question.

All the lines that are to be deleted start with the same seven characters,
'factor(', as in 'factor(xyz)' where xyz is the variable name.  So my idea is
to have Python copy each line to a new file only if its first seven
characters do not match 'factor('.
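
A minimal sketch of that filter, assuming the coefficient lines begin at the
very first column of the log file.  The function names and the file paths
passed to them are made up for illustration:

```python
def drop_factor_lines(lines, prefix="factor("):
    """Yield only the lines that do not start with the given prefix."""
    for line in lines:
        if not line.startswith(prefix):
            yield line


def shorten_file(src_path, dst_path):
    """Copy src_path to dst_path, leaving out the 'factor(' lines.

    Both paths are supplied by the caller; nothing here is specific
    to R beyond the assumed 'factor(' prefix.
    """
    with open(src_path) as src:
        with open(dst_path, "w") as dst:
            dst.writelines(drop_factor_lines(src))
```

Using str.startswith avoids having to count characters by hand, which is
easy to get wrong with a slice like line[:7].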

That part I at least know how to approach.  However, I am not sure how to
add the line that says "includes fixed effects for xyz."  There are two
problems I do not know how to handle:


1. In the resulting file, I will be skipping blocks of lines, anywhere from
10 to 500 or so, and inserting one line per block.  So whether the program
inserts the line needs to depend on whether the current line is the first
line of a block or one of the remaining lines (up to 499 of them).

2. The xyz variable name varies in length depending on which variable it
is.  For example, the name in one block might be 'state' and in another
'yr'.  Maybe I can use the fact that the variable name starts after the
first '(' and ends at the first ')' in the line?  I think I can use the re
module for this?
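
One way both points might fit together, as a rough sketch rather than a
tested solution: a regular expression pulls out the text between the first
'(' and the first ')', and the program remembers the variable name of the
previous line so the summary line is written only when a new block starts.
The names FACTOR_RE and collapse_factor_blocks are hypothetical, and the
sketch assumes each block is a run of consecutive 'factor(' lines for the
same variable:

```python
import re

# Matches 'factor(' at the start of a line and captures everything up to
# the first ')' as the variable name.
FACTOR_RE = re.compile(r"factor\(([^)]*)\)")


def collapse_factor_blocks(lines):
    """Replace each run of 'factor(xyz)' lines with one summary line."""
    previous_var = None  # variable name of the block we are inside, if any
    for line in lines:
        match = FACTOR_RE.match(line)
        if match is None:
            # Ordinary line: we have left any block, so copy it through.
            previous_var = None
            yield line
            continue
        var = match.group(1)
        if var != previous_var:
            # First line of a new block: emit the summary line once.
            yield "includes fixed effects for %s\n" % var
            previous_var = var
        # Remaining lines of the same block are skipped.
```

The generator takes any iterable of lines, so it can be fed an open file
object and its output passed to writelines() on the new file.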


Any suggestions on any aspect of this, but especially the latter part, would
be greatly appreciated.  Thank you.