[Tutor] Adding to a CSV file?

Sun Aug 29 20:12:06 CEST 2010

Hi,

I'm learning Python so I can take advantage of the really cool stuff in the Natural Language Toolkit. But I'm having problems with some basic file manipulation stuff.

My basic question: How do I read data in from a csv, manipulate it, and then add it back to the csv in new columns (keeping the manipulated data in the "right row")?

Here's an example of what my data looks like ("test-8-29-10.csv"):

MyWord      Category  Ct     CatCt
!           A         2932   456454
!           B         2109   64451
a           C         7856   90000
a           A         19911  456454
abnormal    C         174    90000
abnormally  D         5      77777
cats        E         1999   886454
cat         B         160    64451

# I want to read in the MyWord for each row and then do some stuff to it and add in some new columns. Specifically, I want to "lemmatize" and "stem", which basically means I'll turn "abnormally" into "abnormal" and "cats" into "cat".

import nltk
wnl=nltk.WordNetLemmatizer()
porter=nltk.PorterStemmer()
text=nltk.word_tokenize(TheStuffInMyWordColumn)
textlemmatized=[wnl.lemmatize(t) for t in text]
textPort=[porter.stem(t) for t in text]

# This creates the right info, but I don't really want "textlemmatized" and "textPort" to be independent lists, I want them inside the csv in new columns. 

# If I didn't want to keep the information in the Category and Counts columns, I would probably do something like this:

for word in text:
    word2=wnl.lemmatize(word)
    word3=porter.stem(word)
    print word+";"+word2+";"+word3+"\r\n"

# Looking through some of the older discussions about the csv module, I found this code helps identify headers, but I'm still not sure how to use them--or how to word the for-loop that I need correctly so I iterate through each row in the csv file. 

import csv
fp=open(r'c:test-8-29-10.csv', 'r')
inputfile=csv.DictReader(fp)
for record in inputfile:
    print record
{'Category': 'A', 'CatCt': '456454', 'MyWord': '!', 'Ct': '2932'}
{'Category': 'B', 'CatCt': '64451', 'MyWord': '!', 'Ct': '2109'}
...
fp.close() 
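
# For reference, here is a minimal sketch of the read-modify-write loop described above, not a tested solution. It is written for Python 3 and assumes a comma-delimited file with a MyWord header. The names add_columns and naive_stem are hypothetical; naive_stem is only a stand-in for porter.stem so the example runs without NLTK. Rows go to a new output rather than back into the original file, so each new value stays on the row it came from.

```python
import csv
import io


def naive_stem(word):
    """Hypothetical stand-in for porter.stem(); it only strips a trailing 's'."""
    return word[:-1] if word.endswith("s") and len(word) > 1 else word


def add_columns(src, dst, new_columns):
    """Copy CSV rows from src to dst, appending one column per entry in
    new_columns (a dict mapping column name -> function of the MyWord value)."""
    reader = csv.DictReader(src)
    # Keep the original columns in order and put the new ones at the end.
    fieldnames = list(reader.fieldnames) + list(new_columns)
    writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Each row is a dict, so the new values stay attached to their row.
        for name, func in new_columns.items():
            row[name] = func(row["MyWord"])
        writer.writerow(row)


# In-memory demo using two rows from the sample data.
sample = io.StringIO(
    "MyWord,Category,Ct,CatCt\n"
    "cats,E,1999,886454\n"
    "cat,B,160,64451\n"
)
result = io.StringIO()
add_columns(sample, result, {"Stem": naive_stem})
print(result.getvalue())
```

# With real files, src and dst would be file objects opened with newline="" (as the Python 3 csv docs recommend), and the dict could map "Lemma" to wnl.lemmatize and "Stem" to porter.stem instead of the stand-in.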

# So I feel like I have *some* of the pieces, but I'm just missing a bunch of little connections. Any and all help would be much appreciated!

Tyler