[Tutor] how to convert between type string and token

enas khalil enas_khalil at yahoo.com
Tue Nov 15 00:09:23 CET 2005


Hello all,

When I run this code:
# -*- coding: cp1256 -*-
from nltk.tagger import *
from nltk.corpus import brown
from nltk.tokenizer import WhitespaceTokenizer
# Tokenize the training text read from fataha2.txt
train_tokens = []
xx=Token(TEXT=open('fataha2.txt').read())
WhitespaceTokenizer().tokenize(xx)
for l in xx:
    train_tokens.append(l)
# Initialise and train a unigram tagger
mytagger = UnigramTagger(SUBTOKENS='WORDS')
for tok in train_tokens: mytagger.train(tok)
# Once a UnigramTagger has been trained, the tag() method can be used to tag new text:
text_token = Token(TEXT="الحمد لله رب العالمين")
WhitespaceTokenizer(SUBTOKENS='WORDS').tokenize(text_token)
mytagger.tag(text_token)
print 'The first example: using the Unigram Tagger, the results are: '
print 
acc = tagger_accuracy(mytagger, train_tokens)
print ' With Accuracy :Accuracy = %4.1f%%,' % (100 * acc) 
   
I get the following error:
  Traceback (most recent call last):
  File "F:\MSC first Chapters\unigramgtag1.py", line 14, in -toplevel-
    for tok in train_tokens: mytagger.train(tok)
  File "C:\Python24\Lib\site-packages\nltk\tagger\__init__.py", line 324, in train
    assert chktype(1, tagged_token, Token)
  File "C:\Python24\Lib\site-packages\nltk\chktype.py", line 316, in chktype
    raise TypeError(errstr)
TypeError: 
    Argument 1 to train() must have type: Token
      (got a str)
   
Could someone please help me fix this error? In other words, how can I convert a string into a Token? I am still new to Python.

Thanks in advance.
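
A note on the likely cause, offered as a minimal sketch rather than a confirmed diagnosis. It assumes the old NLTK Token behaves like a dictionary, so that "for l in xx" walks over its property names (plain strings) and not over word Tokens. The Token class and the whitespace_tokenize helper below are hypothetical stand-ins written for illustration; they are not the real NLTK classes, and the sample text is made up.

```python
# Hypothetical stand-in for the old NLTK Token, ASSUMED here to be dict-like.
class Token(dict):
    def __init__(self, **properties):
        super().__init__(**properties)


def whitespace_tokenize(token, subtokens="SUBTOKENS"):
    """Hypothetical stand-in for WhitespaceTokenizer(...).tokenize(token).

    Splits token['TEXT'] on whitespace and stores the resulting word
    Tokens on the parent token under the `subtokens` key.
    """
    token[subtokens] = [Token(TEXT=word) for word in token["TEXT"].split()]


xx = Token(TEXT="one two three")
whitespace_tokenize(xx, subtokens="WORDS")

# Iterating over a dict yields its keys, so every item here is a str.
keys = [item for item in xx]

# Appending the Token itself keeps the training list free of strings.
train_tokens = [xx]
```

If that assumption holds, appending xx itself to train_tokens, and passing the same SUBTOKENS='WORDS' to the tokenizer that the tagger is created with, should avoid the str/Token mismatch. Separately, a unigram tagger normally learns from text that already carries tags, so untagged text read from a file may not be enough to train on.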
   


		