[Tutor] Strings and file saving

Doug Stanfield DOUGS@oceanic.com
Fri, 18 Aug 2000 09:45:57 -1000


> I think I am a classic case of PDD ... Python Deficit Disorder :).
> At best I'm one of those slow learners, so bear with me, please.
> 
> Dan

It's more ADD (not attention, algorithm).  I used to be there too, and still
am for more advanced questions.  It's a condition that is cured by exposure
to cool solutions.  Usually administered in a boring set of college classes,
but much easier on the stomach if taken in measured doses of reading Python
code by the gurus.

> Objective: 
> 1)  Read this ASCII file that is space deliminated;
You got this right.
> 2)  Pull out only the lines that have a '$' characters in the line;
This needs a new idea.  Try this:

# outside of the loop set up a variable to hold the results
result = ""    # in this case I'm going to build a string because
               # that's what I need to write out.  The readlines
               # method gave you a list of strings, so if you
               # really needed to work with that list you could make
               # this result = [], the empty list.
# read each line in the file
for line in in_file.readlines():
    if '$' in line:  # Untested but I think this does the same thing
                     # your test did.  It's probably faster if it works.
> #		# "print line" works well
> #		#print line
>       a = line  # it's usually nicer to use a descriptive variable
                      # name.  Six months from now you'll know why.
        result = result + line  # put the current line on the end of
                                # the result till now.  If result was
                                # a list this would be result.append(line)
# by the time your program gets to here the variable 'result' holds
# everything you want.
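
For what it's worth, here is a rough sketch of the list variant mentioned in
the comments above.  The sample lines are made up and only stand in for what
in_file.readlines() would hand you, and the name 'text' is mine, not
something from your program.

```python
# Made-up sample lines standing in for in_file.readlines().
sample_lines = ["widget $4.00\n", "no price here\n", "gadget $9.50\n"]

result = []                      # the empty list instead of ""
for line in sample_lines:
    if '$' in line:
        result.append(line)      # list version of result = result + line

# Glue the kept lines into one string so a single write() can save them.
text = "".join(result)
```

Each kept line still carries its own newline, so joining on the empty string
is enough.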

> 3)  Save the data to a new file.
I'd collect the whole result and write it to the file in one fell swoop.
That may just be me.  It should also work to write each line as you find it,
as long as the write sits inside the if test.  I think it's faster if it's
all done at once.  More comments below though.

out_file.write(result)  # should do it; just be sure it's at the right
                        # indent level (outside the for loop).
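
Putting the pieces together, a minimal sketch might look like the following.
The function name copy_dollar_lines, the temporary directory and the sample
data are all my own inventions for illustration; only the file names
"data_file" and "junk2.txt" come from your script.  It is written for a
current Python rather than tested against your setup.

```python
import os
import tempfile

def copy_dollar_lines(in_path, out_path):
    """Copy every line containing '$' from in_path to out_path."""
    result = ""                      # accumulates the matching lines
    in_file = open(in_path, "r")
    for line in in_file.readlines():
        if '$' in line:
            result = result + line
    in_file.close()

    out_file = open(out_path, "w")
    out_file.write(result)           # one write, after the loop has finished
    out_file.close()
    return result

# Made-up sample data, written to a scratch directory for the demonstration.
work_dir = tempfile.mkdtemp()
sample_in = os.path.join(work_dir, "data_file")
sample_out = os.path.join(work_dir, "junk2.txt")
f = open(sample_in, "w")
f.write("widget $4.00\nno price here\ngadget $9.50\n")
f.close()

kept = copy_dollar_lines(sample_in, sample_out)
```

The write happens exactly once, so its indent level matches the for
statement, not the body of the loop.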
> 
> #  ----------- Code -----------  
> # import sys and string
> import sys
> import string
> 
> # get the appropriate data file
> in_file = open("data_file","r")
> 
> # create the file to export the data to
> out_file = open("junk2.txt","w+")
> 
> # read each line in the file	
> for line in in_file.readlines():
> 	if string.find(line,'$')>= 0:
> #		# "print line" works well
> #		#print line
> 		a = line
> #	# write the new file you created to the file "junk2.txt" 
> 	out_file.write(a)
> out_file.close()
> in_file.close()
> #  ----------- End of Code --------------
> 
> Traceback (innermost last):
>   File "today6.py", line 19, in ?
>     out_file.write(a)
> NameError: a

The reason this gives a traceback is that the first line read didn't have a
'$' in it.  Therefore 'a' was never set; in fact that line of code wasn't
even run.  You've put the write statement at the same indent level as the if
that is finding the '$' character, which means you're trying to write for
every line in the input.  You only want to write the lines that have the
'$', so the write as you have it would have to be inside the test.  The way
it is, if the first line had had a '$', you wouldn't have gotten the
traceback.  It would have run, but each '$' line would have been written
once for itself and then once more for every non-'$' line that followed it,
until the next '$' line came along.
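
If it helps to see both behaviours side by side, here is a small sketch.
The functions buggy_filter and fixed_filter are hypothetical stand-ins I made
up; they collect into a list instead of writing to a file so the effect is
easy to inspect.  Note that in a current Python the error inside a function
shows up as UnboundLocalError, which is a kind of NameError.

```python
def buggy_filter(lines):
    """Mimic the original loop: the 'write' runs for every line."""
    written = []
    for line in lines:
        if '$' in line:
            a = line
        written.append(a)    # same indent as the 'if', so it always runs
    return written

def fixed_filter(lines):
    """The 'write' sits inside the test, so only '$' lines are kept."""
    written = []
    for line in lines:
        if '$' in line:
            written.append(line)
    return written

# Case 1: the first line has no '$', so 'a' is used before it is ever set.
try:
    buggy_filter(["plain\n", "cost $5\n"])
    first_error = None
except NameError as err:     # UnboundLocalError is a subclass of NameError
    first_error = err

# Case 2: the first line has a '$', so no error, but it gets repeated.
sample = ["cost $5\n", "plain\n", "plain again\n"]
repeated = buggy_filter(sample)
correct = fixed_filter(sample)
```

Here 'repeated' ends up holding the '$' line three times, while 'correct'
holds it once.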

Hope this has been helpful.  This tutor list is definitely the best place to
work on these kinds of questions.

-Doug-