[Tutor] NLTK

Kent Johnson kent37 at tds.net
Sat Aug 29 04:03:15 CEST 2009


On Fri, Aug 28, 2009 at 7:29 PM, Ishan Puri<ballerz4ishi at sbcglobal.net> wrote:
> Hi,
> >>> from nltk.corpus import PlaintextCorpusReader
> >>> corpus_root='C:\Users\Ishan\Documents'
> >>> wordlists = PlaintextCorpusReader(corpus_root, 'IM50re.txt')
> >>> wordlists.fileids()
> ['IM50re.txt']
>
> This is the result I get.

That seems to be working then. You should be able to get a list of words with
wordlists.words('IM50re.txt')
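
As a minimal sketch of that call: the temporary directory and the file
name sample.txt below are made up so the example runs anywhere, and are
only stand-ins for your own corpus_root and IM50re.txt.

```python
import os
import tempfile

from nltk.corpus import PlaintextCorpusReader

# Hypothetical stand-in for the real corpus directory and text file.
# (With a real Windows path on Python 3, write it as a raw string,
# e.g. r'C:\Users\...', so the backslashes are not read as escapes.)
corpus_root = tempfile.mkdtemp()
with open(os.path.join(corpus_root, 'sample.txt'), 'w') as f:
    f.write('The cats were sleeping. The dog barked.')

wordlists = PlaintextCorpusReader(corpus_root, 'sample.txt')

# words() returns a lazy view over the file; list() turns it into an
# ordinary list of tokens (words and punctuation marks).
words = list(wordlists.words('sample.txt'))

print(wordlists.fileids())
print(words[:5])
print(len(words))
```

Once the words are in a plain list you can loop over them or pass them
to whatever other function you want to try.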

> I was wondering how I can use the packages on
> IM50re.txt? I followed successfully the steps detailed under Using Your Own
> Corpus. What do I do next, say, if I wanted to use the lemmatizer on this
> .txt document?

I have no idea. Is IM50re.txt a plain text corpus? What is a package?
What is a lemmatizer?

I don't know anything about NLTK; I'm just good at reading manuals.
You have to give me more help than that. What have you tried? Can you
find an example that is similar to what you want to do? Don't assume I
know what you are talking about :-)

Kent

