[Tutor] Help Please

DL Neil PyTutor at danceswithmice.info
Thu Feb 21 00:27:21 EST 2019


Mario,

On 21/02/19 3:30 AM, Mario Ontiveros wrote:
> Hello,
>      I am new to python and have been stuck on this for a while. What I am trying to do is to remove rows with void, disconnected, and error on lines. The code I have does that, the only problem is that it removes my header because void is in header. I need to keep header.
> with open("PSS.csv","r+") as f:
>      new_f = f.readlines()
>      f.seek(0)
>      for line in new_f:
>          if "Void" not in line:
>              if "Disconnected" not in line:
>                  if "Error" not in line:
>                      f.write(line)
>      f.truncate()


Would it be 'safer' to create a separate output file?

Rather than reading the entire file (easily managed if short, but 
unwieldy and RAM-hungry if thousands of records!), consider that a file 
object is an iterable and process it one line/record at a time.

with open( ... ) as f:
	header = f.readline()
	# deal with the header record
	for record in f:
		function_keep_or_discard( record )
	#etc
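
To make that outline concrete, here is a minimal sketch which writes to a 
separate output file and always keeps the header. The names keep(), 
filter_file() and UNWANTED are my own inventions for illustration, and 
the three marker words are simply those from your question, so adjust to 
suit:

```python
# Hypothetical marker words, taken from the original question.
UNWANTED = ("Void", "Disconnected", "Error")


def keep(record):
    """Return True if the record contains none of the unwanted words."""
    return not any(word in record for word in UNWANTED)


def filter_file(source_path, target_path):
    """Copy source to target, always keeping the header record."""
    with open(source_path, "r") as source, open(target_path, "w") as target:
        header = source.readline()      # first record, kept unconditionally
        target.write(header)
        for record in source:           # carries on from the second record
            if keep(record):
                target.write(record)
```

It might be called as filter_file("PSS.csv", "PSS_clean.csv"), where the 
second file name is only an assumption. The original file is never 
modified, so a mistake in the filter cannot destroy your data.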


In case it helps you to follow the above, and possibly to learn other 
applications of this thinking, herewith:-

An iterable matches a for-each-loop very neatly (by design). Iterating it 
involves two things: next(), i.e. give me the next value (thus for each 
value in turn), and the StopIteration exception (raised when next() asks 
for another value after they have all been processed). The for 'swallows' 
the exception because it is expected. Hence, you don't need to 
try...except!

Something a lot of pythonistas don't stop to consider is that once code 
starts iterating a file object, the iteration does not 'reset' part-way 
through: each next() carries on from wherever the previous one stopped 
(unless you rewind explicitly, as your f.seek(0) does). 
Accordingly, we can use a 'bare' next() to pick out the first (header) 
record and then pass the rest of the job (all the other next()s) to a 
for-each-loop:

with open( ... ) as f:
	header = next( f )	# grab the first record
	# deal with the header record
	for record in f:	# iterate through the remaining records
		#etc
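
A small demonstration of that behaviour, assuming some made-up records 
and using io.StringIO as a stand-in for an open file (so that it runs 
without PSS.csv). The optional second argument to next() is a default, 
returned instead of raising StopIteration, which is handy should the file 
turn out to be empty:

```python
import io

# io.StringIO stands in for an open file; the records are invented.
f = io.StringIO("Name,Void,Status\nalice,no,OK\nbob,no,Error\n")

header = next(f)        # consumes the first record only
rest = list(f)          # iteration carries on from the second record
again = list(f)         # nothing left: the iteration has not 'reset'

empty = io.StringIO("")
no_header = next(empty, None)   # default avoids StopIteration on empty input

print(repr(header))     # 'Name,Void,Status\n'
print(len(rest))        # 2
print(again)            # []
print(no_header)        # None
```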

-- 
Regards =dn

