[Tutor] NLTK
Ishan Puri
ballerz4ishi at sbcglobal.net
Sat Aug 29 01:29:27 CEST 2009
Hi,
>>> from nltk.corpus import PlaintextCorpusReader
>>> corpus_root = r'C:\Users\Ishan\Documents'
>>> wordlists = PlaintextCorpusReader(corpus_root, 'IM50re.txt')
>>> wordlists.fileids()
['IM50re.txt']
This is the result I get. How can I now use the NLTK packages on IM50re.txt? I successfully followed the steps detailed under Using Your Own Corpus. What do I do next if, say, I want to run the lemmatizer on this .txt document?
Thank you.
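Regarding the lemmatizer question above, one possible next step might look like the sketch below. It assumes the WordNet data is installed (for example via nltk.download('wordnet')). The helper name lemmatize_corpus is hypothetical and not part of NLTK, and lowercasing each token first is only one reasonable choice.

```python
from nltk.corpus import PlaintextCorpusReader
from nltk.stem import WordNetLemmatizer


def lemmatize_corpus(corpus_root, fileid, lemmatizer=None):
    """Return one lemma per token of a single file in a plain-text corpus.

    Hypothetical helper, not part of NLTK.
    """
    if lemmatizer is None:
        # Assumption: the WordNet data is installed; otherwise the first
        # call to lemmatize() raises LookupError.
        lemmatizer = WordNetLemmatizer()
    reader = PlaintextCorpusReader(corpus_root, [fileid])
    # words() tokenizes the file contents; lemmatize() treats each token
    # as a noun unless a part of speech is passed.
    return [lemmatizer.lemmatize(word.lower()) for word in reader.words(fileid)]


# Possible usage with the corpus from this message (raw string, so the
# backslashes in the Windows path are kept literally):
# lemmas = lemmatize_corpus(r'C:\Users\Ishan\Documents', 'IM50re.txt')
# print(lemmas[:20])
```

Passing the lemmatizer in as a parameter is a design choice that makes it easy to swap in a stemmer or another object with a lemmatize() method.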
________________________________
From: Kent Johnson <kent37 at tds.net>
To: Ishan Puri <ballerz4ishi at sbcglobal.net>
Cc: *tutor python <tutor at python.org>
Sent: Friday, August 28, 2009 4:24:19 PM
Subject: Re: [Tutor] NLTK
On Fri, Aug 28, 2009 at 6:09 PM, Ishan Puri<ballerz4ishi at sbcglobal.net> wrote:
> Hi,
> Thanks for your response. I tried this and got to the 3rd line. However,
> when I type in the fourth:
>
> >>> wordlists.fileids()
>
> the result is blank. When I try the len() function, it only counts the
> letters in the title of the text document, IM50re.txt. How do I get it to
> open and analyze the text, as they have done with the Gutenberg texts at
> the beginning of the chapter?
Did you give the correct path to your files? How did you use len()? It
helps if you show what you tried and what result you got.
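As a guess at what may be happening: len() applied to the file name counts the characters of that string, while len() applied to the reader's words() counts the tokens in the file. A minimal sketch of the difference, assuming a throwaway directory and made-up stand-in text rather than your real document:

```python
import os
import tempfile

from nltk.corpus import PlaintextCorpusReader

# Assumption: a temporary directory and made-up text stand in for the
# real corpus folder and the real contents of IM50re.txt.
with tempfile.TemporaryDirectory() as corpus_root:
    path = os.path.join(corpus_root, 'IM50re.txt')
    with open(path, 'w', encoding='utf-8') as fh:
        fh.write('The cats were running.\n')

    wordlists = PlaintextCorpusReader(corpus_root, ['IM50re.txt'])

    # len() of the file name counts the characters in the string itself.
    name_length = len('IM50re.txt')

    # len() of words() counts the tokens read from the file.
    words = list(wordlists.words('IM50re.txt'))
    word_count = len(words)

print(name_length)
print(words)
print(word_count)
```

If the second count looks plausible for your document, the reader is finding and reading the file.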
Please Reply All to reply to the list.
Kent
> Thank you.
>
>
>
> ________________________________
> From: Kent Johnson <kent37 at tds.net>
> To: Ishan Puri <ballerz4ishi at sbcglobal.net>
> Cc: Python Tutor <tutor at python.org>
> Sent: Friday, August 28, 2009 4:49:40 AM
> Subject: Re: [Tutor] NLTK
>
> On Fri, Aug 28, 2009 at 3:14 AM, Ishan Puri<ballerz4ishi at sbcglobal.net>
> wrote:
>> Hello,
>> I have successfully downloaded NLTK and the toy grammars. I want to run
>> a few of the packages that come with NLTK on corpora that I have. How do I
>> do this? What commands would I use? The corpora are text files; should I
>> put them in the Python25 folder (is that the so-called same directory)?
>
> The section Loading your own Corpus in the book seems to show what you want:
> http://nltk.googlecode.com/svn/trunk/doc/book/ch02.html
>
> Kent
>