intermediate python csv reader/writer question from a beginner
Nick Craig-Wood
nick at craig-wood.com
Tue Feb 24 04:31:53 EST 2009
Learning Python <labmice at gmail.com> wrote:
> anything related to csv, I usually use VB within excel to manipulate
> the data, nonetheless, i finally got the courage to take a dive into
> python. i have viewed a lot of googled csv tutorials, but none of
> them address everything i need. Nonetheless, I was wondering if
> someone can help me manipulate the sample csv (sample.csv) I have
> generated:
>
> ,,
> someinfo,,,,,,,
> somotherinfo,,,,,,,
> SEQ,Names,Test1,Test2,Date,Time,,
> 1,Adam,1,2,Monday,1:00 PM,,
> 2,Bob,3,4,Monday,1:00 PM,,
> 3,Charlie,5,6,Monday,1:00 PM,,
> 4,Adam,7,8,Monday,2:00 PM,,
> 5,Bob,9,10,Monday,2:00 PM,,
> 6,Charlie,11,12,Monday,2:00 PM,,
> 7,Adam,13,14,Tuesday,1:00 PM,,
> 8,Bob,15,16,Tuesday,1:00 PM,,
> 9,Charlie,17,18,Tuesday,1:00 PM,,
>
> into (newfile.csv):
>
> Adam-Test1,Adam-Test2,Bob-Test1,Bob-Test2,Charlie-Test1,Charlie-Test2,Date,Time
> 1,2,3,4,5,6,Monday,1:00 PM
> 7,8,9,10,11,12,Monday,2:00 PM
> 13,14,15,16,17,18,Tuesday,1:00 PM
>
> note:
> 1. the true header doesn't start line 4 (if this is the case would i
> have to use "split"?)
> 2. if there were SEQ#10-12, or 13-15, it would still be Adam, Bob,
> Charlie, but with different Test1/Test2/Date/Time
I'm not really sure what you are trying to calculate, but this should
give you some ideas...
    import csv
    from collections import defaultdict

    reader = csv.reader(open("sample.csv"))
    result = defaultdict(list)
    for row in reader:
        # ignore rows unless the first column is numeric
        if not row or not row[0].isdigit():
            continue
        n, name, a, b, day, time = row[:6]
        print "n=%r, name=%r, a=%r, b=%r, day=%r, time=%r" % (
            n, name, a, b, day, time)
        # collect the SEQ numbers seen for each (day, time) pair
        result[(day, time)].append(n)

    writer = csv.writer(open("newfile.csv", "w"))
    for key, values in result.iteritems():
        day, time = key
        values = values + [day, time]
        writer.writerow(values)
This prints
n='1', name='Adam', a='1', b='2', day='Monday', time='1:00 PM'
n='2', name='Bob', a='3', b='4', day='Monday', time='1:00 PM'
n='3', name='Charlie', a='5', b='6', day='Monday', time='1:00 PM'
n='4', name='Adam', a='7', b='8', day='Monday', time='2:00 PM'
n='5', name='Bob', a='9', b='10', day='Monday', time='2:00 PM'
n='6', name='Charlie', a='11', b='12', day='Monday', time='2:00 PM'
n='7', name='Adam', a='13', b='14', day='Tuesday', time='1:00 PM'
n='8', name='Bob', a='15', b='16', day='Tuesday', time='1:00 PM'
n='9', name='Charlie', a='17', b='18', day='Tuesday', time='1:00 PM'
And leaves newfile.csv with the contents
1,2,3,Monday,1:00 PM
7,8,9,Tuesday,1:00 PM
4,5,6,Monday,2:00 PM
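If the goal is the layout shown in the question (one row per Date/Time, with a Test1/Test2 pair of columns per name), the same grouping idea can be pushed a bit further. The following is only a rough sketch under some assumptions: it is written for Python 3 rather than the Python 2 used above, the helper name pivot_rows is made up for illustration, the real header row is assumed to be the first row whose first cell is "SEQ", and a name missing from a time slot is assumed to get empty cells.

```python
import csv
import io
from collections import OrderedDict


def pivot_rows(rows):
    """Turn one-row-per-person records into one row per (Date, Time) slot.

    `rows` is any iterable of lists, such as a csv.reader object.
    (Hypothetical helper, not part of the csv module.)
    """
    header = None
    names = []             # people, in order of first appearance
    slots = OrderedDict()  # (day, time) -> {name: (test1, test2)}
    for row in rows:
        if header is None:
            # Skip the preamble until the real header row turns up.
            if row and row[0] == "SEQ":
                header = row
            continue
        # Ignore blank rows and anything whose first column is not numeric.
        if not row or not row[0].isdigit():
            continue
        _seq, name, test1, test2, day, time = row[:6]
        if name not in names:
            names.append(name)
        slots.setdefault((day, time), {})[name] = (test1, test2)

    if header is None:
        raise ValueError("no header row starting with SEQ found")

    test_labels = header[2:4]  # the two test column titles
    out_header = ["%s-%s" % (n, t) for n in names for t in test_labels]
    out = [out_header + header[4:6]]
    for (day, time), scores in slots.items():
        line = []
        for name in names:
            # Assumption: leave cells empty when a name has no entry.
            line.extend(scores.get(name, ("", "")))
        out.append(line + [day, time])
    return out


# A cut-down, in-memory version of sample.csv for demonstration.
SAMPLE = """,,
someinfo,,,,,,,
SEQ,Names,Test1,Test2,Date,Time,,
1,Adam,1,2,Monday,1:00 PM,,
2,Bob,3,4,Monday,1:00 PM,,
3,Adam,7,8,Monday,2:00 PM,,
4,Bob,9,10,Monday,2:00 PM,,
"""

table = pivot_rows(csv.reader(io.StringIO(SAMPLE)))
for line in table:
    print(",".join(line))
```

To run it against the real files, pass csv.reader(open("sample.csv", newline="")) to pivot_rows and hand the result to the writerows method of a csv.writer opened on newfile.csv. Because names and time slots are kept in order of first appearance, extra SEQ rows for the same three names simply add more output rows.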
--
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick