[Tutor] Sentiment analysis read from a file

Rajesh Balel rajbalel at gmail.com
Wed Mar 28 13:44:46 EDT 2018


It seems you have tab-separated data:

# Split each line on the TAB character into [label, text]
with open('training.txt') as f:
    my_data = [x.strip().split('\t') for x in f]

for x in my_data:
    print(x)
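
If the goal is to collect only the positive reviews, a minimal sketch
along the lines of Peter's csv suggestion quoted below might look like
the following. The helper names load_reviews and positive_reviews are
my own, and the second sample line is made up for illustration; only
the first sample line comes from the real training.txt. The sketch
assumes Python 3 and a file with exactly two tab-separated columns
(label, then text).

```python
import csv
import os
import tempfile


def load_reviews(path):
    """Return a list of [label, text] rows from a tab-separated file."""
    # newline="" leaves newline handling to the csv module, as its docs advise
    with open(path, newline="") as f:
        return list(csv.reader(f, delimiter="\t"))


def positive_reviews(rows):
    """Return the text of every row whose label is the string '1'."""
    # Skip blank or malformed rows that do not have both columns
    return [row[1] for row in rows if len(row) >= 2 and row[0] == "1"]


# Hypothetical sample data; the second line is invented for this demo
sample = (
    "1\tThe Da Vinci Code book is just awesome.\n"
    "0\tThis was a dull read.\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", newline="") as f:
        f.write(sample)
    rows = load_reviews(path)

print(positive_reviews(rows))
```

With the real data you would call load_reviews("training.txt") instead
of writing the temporary sample file.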

Regards
Rajesh


On Wed, Mar 28, 2018 at 10:14 AM, Peter Otten <__peter__ at web.de> wrote:

> Alan Gauld via Tutor wrote:
>
> > On 28/03/18 11:07, theano orf wrote:
> >> I am new to Python and I am having trouble reading a txt file and
> >> inserting the data into a list,
> >
> > Just a quick response, but your data is more than a text file, it's a CSV
> > file, so the rules change slightly. Especially since you are using the csv
> > module.
> >
> > Your data file is not a CSV file - it is just space separated and the
> > string is not quoted, so the CSV default mode of operation won't
> > work on this data as you seem to expect it to. You will need to
> > specify the separator (as what? A space will split on each word...)
> > CSV might not be the best option here; a simple string split combined
> > with slicing might be better.
>
> >>> next(open("training.txt"))
> '1\tThe Da Vinci Code book is just awesome.\n'
>
> So the delimiter would be TAB:
>
> >>> import csv
> >>> next(csv.reader(open("training.txt"), delimiter="\t"))
> ['1', 'The Da Vinci Code book is just awesome.']
>
> >> with open("training.txt", 'r') as file:
> >
> > The CSV module prefers binary files so open it with mode 'rb' not 'r'
>
> That's no longer true for Python 3:
>
> >>> next(csv.reader(open("training.txt", "rb"), delimiter="\t"))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
>
> However, as csv still does its own newline handling it's a good idea to get
> into the habit of opening the file with newline="" as explained here:
>
> https://docs.python.org/dev/library/csv.html#id3
>
> >> reviews = list(csv.reader(file))
> >
> > Try printing the first 2 lines of reviews to check what you have.
> > I suspect it's not what you think.
> >
> >>    positive_review = [r[1] for r in reviews if r[0] == str(1)]
> >
> > str(1) is just '1' so you might as well just use that.
> >
> >> After the print I only get an empty list. Why is this happening? I am
> >> also attaching the training.txt file.
> >
> > See the comments above about your data format.
> >

