[Tutor] creating a corpus from a csv file

Treder, Robert Robert.Treder at morganstanley.com
Fri May 3 22:48:40 CEST 2013


Hi, 
 
I'm very new to Python and am trying to figure out how to build a corpus from a text file. I have a CSV file (actually pipe-delimited, '|') in which each row corresponds to a separate text document. One column of each row holds a communication note, and other columns hold the categories of the communication. I am able to read the file and print the notes column as follows:
 
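import csv

# Python 2: open in binary mode so the csv module handles line endings itself.
with open('notes.txt', 'rb') as infile:
    reader = csv.reader(infile, delimiter='|')
    # Print the notes column (index 8) for the first 26 rows only.
    for i, row in enumerate(reader):
        if i > 25:
            break
        print row[8]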

I would like to convert this into a categorized corpus, with some of the other columns supplying the categories. All of the columns are text (i.e., strings). I have looked for documentation on how to use csv.reader with PlaintextCorpusReader, but have not found an example similar to what I want to do. Can someone please help?
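
One possible route, sketched here under assumptions and not as a definitive answer: PlaintextCorpusReader reads files from a directory, not rows from csv.reader, so the rows could first be written out as one text file per note, together with a category file that NLTK's CategorizedPlaintextCorpusReader can read through its cat_file argument. The sketch below is Python 3 (unlike the Python 2 snippet above) and uses only the standard library. The function name build_corpus_dir, the note_NNNNN.txt file naming, and the choice of category columns are hypothetical, and it assumes the file has no header row.

```python
import csv
import os


def build_corpus_dir(csv_path, out_dir, cat_cols, note_col=8):
    """Write one text file per row, plus a cats.txt mapping files to categories.

    cat_cols is a sequence of column indexes holding category labels
    (hypothetical; pick whichever columns apply). note_col defaults to 8,
    the notes column used in the snippet above.
    """
    os.makedirs(out_dir, exist_ok=True)
    needed = max([note_col] + list(cat_cols))
    mapping = {}
    with open(csv_path, newline='', encoding='utf-8') as infile:
        reader = csv.reader(infile, delimiter='|')
        for i, row in enumerate(reader):
            if len(row) <= needed:
                continue  # skip short or malformed rows
            # cats.txt is space-delimited, so labels must not contain spaces.
            cats = [row[c].strip().replace(' ', '_')
                    for c in cat_cols if row[c].strip()]
            if not cats:
                continue  # skip rows that carry no category at all
            fileid = 'note_%05d.txt' % i
            with open(os.path.join(out_dir, fileid), 'w', encoding='utf-8') as out:
                out.write(row[note_col])
            mapping[fileid] = cats
    # One line per document: "<fileid> <category> [<category> ...]"
    with open(os.path.join(out_dir, 'cats.txt'), 'w', encoding='utf-8') as cat_file:
        for fileid, cats in sorted(mapping.items()):
            cat_file.write(' '.join([fileid] + cats) + '\n')
    return mapping


# Loading the result with NLTK (assumes nltk is installed; not run here):
# from nltk.corpus.reader import CategorizedPlaintextCorpusReader
# corpus = CategorizedPlaintextCorpusReader(
#     'corpus', r'note_\d+\.txt', cat_file='cats.txt')
# corpus.categories()
```

Writing the notes to disk first keeps the CSV parsing separate from the corpus reader, which only needs a directory, a file-name pattern, and the category file.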
 
Thanks, 
Bob

