fixing an horrific formatted csv file.
flebber.crue at gmail.com
Fri Jul 4 06:12:15 CEST 2014
I have taken the code and gone a little further, but I need to be able to protect myself against commas and single quotes in names.
What is the best way to do this?
In my file, line 44 has this trainer name:
"Michael, Wayne & John Hawkes"
and on line 95 this horse name.
This throws off my capture of the correct item 9. How do I protect against this?
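To make the problem concrete, here is a minimal sketch of what happens to that trainer name. The sample row is hypothetical (only the trainer name comes from my file), and csv.reader from the standard library is shown as one possible way around it, not necessarily the best one.

```python
import csv

# Hypothetical sample row: only the trainer name comes from the file.
line = 'Horse,1,Some Horse,"Michael, Wayne & John Hawkes",Rosehill'

# Naive splitting breaks the quoted trainer name into two items,
# which shifts every later item along by one.
naive = [item.strip(' "') for item in line.split(',')]

# csv.reader treats a double-quoted field as one item, commas included.
parsed = next(csv.reader([line]))

print(len(naive), naive[3])    # 6 Michael
print(len(parsed), parsed[3])  # 5 Michael, Wayne & John Hawkes
```

The default quote character for csv.reader is the double quote, so an apostrophe inside a name should pass through as an ordinary character.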
Here is the current code.
import re
from sys import argv

SCRIPT, FILENAME = argv


def out_file_name(file_name):
    """take an input file and keep the name with appended _clean"""
    file_parts = file_name.split(".")
    output_file = file_parts[0] + '_clean.' + file_parts[1]
    return output_file


def race_table(text_file):
    """utility to reorganise poorly made csv entry"""
    input_table = [[item.strip(' "') for item in record.split(',')]
                   for record in text_file.splitlines()]
    # At this point look at input_table to find the record indices;
    # the positions used below are assumptions and must be checked
    # against the file.
    output_table = []
    for record in input_table:
        if record[0] == 'Meeting':
            meeting = record[3]
        elif record[0] == 'Race':
            date = record[13]
            race = record[1]
        elif record[0] == 'Horse':
            number = record[1]
            name = record[2]
            results = record[9]
            # results holds starts, wins, seconds, thirds and prizemoney,
            # separated by hyphens or spaces
            res_split = re.split('[- ]', results)
            starts = res_split[0]
            wins = res_split[1]
            seconds = res_split[2]
            thirds = res_split[3]
            prizemoney = res_split[4]
            trainer = record[4]
            location = record[5]
            print(name, wins, seconds)
            output_table.append((meeting, date, race, number, name,
                                 starts, wins, seconds, thirds, prizemoney,
                                 trainer, location))
    return output_table


MY_FILE = out_file_name(FILENAME)

# with open(FILENAME, 'r') as f_in, open(MY_FILE, 'w') as f_out:
#     for line in race_table(f_in.readline()):
#         new_row = line
with open(FILENAME, 'r') as f_in, open(MY_FILE, 'w') as f_out:
    CONTENT = f_in.read()
    FILE_CONTENTS = race_table(CONTENT)
    # print new_name

if __name__ == '__main__':
    pass
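The code above opens f_out but does not write anything to it yet. As a rough sketch of that missing step, and assuming the rows look like the tuples in output_table, csv.writer would quote any name containing a comma on the way out. The row values and the in-memory buffer standing in for f_out are made up for illustration.

```python
import csv
import io

# Hypothetical output row in the same shape as the output_table tuples.
rows = [('Some Meeting', '1', "Some Horse's Name",
         'Michael, Wayne & John Hawkes')]

buffer = io.StringIO()  # stands in for f_out in this sketch
writer = csv.writer(buffer)
writer.writerows(rows)

text = buffer.getvalue()
print(text)

# Reading the text back recovers the original fields unchanged.
round_trip = next(csv.reader(io.StringIO(text)))
print(round_trip == list(rows[0]))
```

With a real file, the csv documentation recommends opening it with newline='' so the writer controls the line endings itself.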