pattern
Cameron Simpson
cs at cskk.id.au
Thu Jun 14 23:42:03 EDT 2018
On 14Jun2018 20:01, Sharan Basappa <sharan.basappa at gmail.com> wrote:
>> >Can anyone explain to me the purpose of "pattern" in the line below:
>> >
>> >documents.append((w, pattern['class']))
>> >
>> >documents is declared as a list as follows:
>> >documents.append((w, pattern['class']))
>>
>> Not without a lot more context. Where did you find this code?
>
>I am sorry that partial info was not sufficient.
>I am actually trying to implement my first text classification code and I am referring to the below URL for that:
>
>https://machinelearnings.co/text-classification-using-neural-networks-f5cd7b8765c6

Ah, ok. It helps to include some cut/paste of the relevant code, though the URL
is a big help.

The wider context of the code you quoted looks like this:

    words = []
    classes = []
    documents = []
    ignore_words = ['?']
    # loop through each sentence in our training data
    for pattern in training_data:
        # tokenize each word in the sentence
        w = nltk.word_tokenize(pattern['sentence'])
        # add to our words list
        words.extend(w)
        # add to documents in our corpus
        documents.append((w, pattern['class']))

and the training_data is defined like this:

    training_data = []
    training_data.append({"class":"greeting", "sentence":"how are you?"})
    training_data.append({"class":"greeting", "sentence":"how is your day?"})
    ... lots more ...

So training_data is a list of dicts, each dict holding a "class" and a
"sentence" key. The "for pattern in training_data" loop iterates over each item
of training_data, so "pattern" is just the loop variable: on each pass it names
the current dict. The loop calls nltk.word_tokenize on the 'sentence' part of
the training item, presumably getting a list of "word" strings. The documents
list gets this tuple:

    (w, pattern['class'])

added to it.
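
To make the role of "pattern" concrete, here is a small sketch that assumes
only the two training items quoted above (the rest of the article's data is
left out). It prints what the loop variable holds on each pass:

```python
# Two items copied from the article's training_data; the rest is omitted.
training_data = []
training_data.append({"class": "greeting", "sentence": "how are you?"})
training_data.append({"class": "greeting", "sentence": "how is your day?"})

# "pattern" is just the loop variable: on each pass it is bound to one dict.
for pattern in training_data:
    print(type(pattern).__name__, pattern['class'], repr(pattern['sentence']))
```

Any other name would work equally well for the loop variable; "pattern" has no
special meaning to Python or to nltk.
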
In this way the documents list ends up with tuples of (words, classification),
with the words coming from the sentence via nltk and the classification coming
straight from the training item's "class" value.
So at the end of the loop the documents list will look like:

    documents = [
        ( ['how', 'are', 'you'], 'greeting' ),
        ( ['how', 'is', 'your', 'day'], 'greeting' ),
    ]

and so forth, possibly with punctuation such as '?' showing up as separate
tokens in the word lists.
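
If you want to try the loop without installing nltk, the following is a rough
sketch of the same logic. Note that simple_tokenize is a hypothetical stand-in
I made up for illustration; it is not nltk.word_tokenize and the real tokenizer
may split some text differently. Only the two training items quoted above are
used.

```python
import re

def simple_tokenize(sentence):
    # Hypothetical stand-in for nltk.word_tokenize (an assumption, not the
    # real thing): runs of word characters and single punctuation marks
    # each become their own token.
    return re.findall(r"\w+|[^\w\s]", sentence)

training_data = [
    {"class": "greeting", "sentence": "how are you?"},
    {"class": "greeting", "sentence": "how is your day?"},
]

words = []
documents = []
for pattern in training_data:
    # w is the list of tokens for this one sentence
    w = simple_tokenize(pattern['sentence'])
    # words collects every token from every sentence
    words.extend(w)
    # documents pairs each token list with that sentence's class
    documents.append((w, pattern['class']))

for tokens, label in documents:
    print(label, tokens)
```

With this stand-in the '?' comes out as its own token, which is presumably why
the article keeps an ignore_words list for later filtering.
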
Cheers,
Cameron Simpson <cs at cskk.id.au>