[Tutor] how to convert between type string and token

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Tue Nov 15 01:42:31 CET 2005



> Ok.  I see something suspicious here.  The for loop:
>
> ######
> for l in xx:
>     train_tokens.append(l)
> ######
>
> assumes that we get tokens from the 'xx' token.  Is this true?  Are you
> sure you don't have to specifically say:
>
> ######
> for l in xx['SUBTOKENS']:
>     ...
> ######


Hi Enas,

I'm taking python-list out of the CC again; my apologies to the others for
not catching that sooner.

Enas, please do not crosspost to multiple mailing lists.  You have been
doing this since at least June:

    http://mail.python.org/pipermail/python-list/2005-June/287371.html
    http://mail.python.org/pipermail/tutor/2005-June/039351.html

    http://mail.python.org/pipermail/tutor/2005-July/039642.html
    http://mail.python.org/pipermail/python-list/2005-July/289505.html

    http://mail.python.org/pipermail/tutor/2005-October/042155.html
    http://mail.python.org/pipermail/python-list/2005-October/303805.html

If you crosspost, we at Tutor won't be able to see responses that go to
python-list, and vice versa.  The end result clutters both lists and isn't
friendly to either community.  Please read:

    http://www.gweep.ca/~edmonds/usenet/ml-etiquette.html

and try to change your habits in this area.



Anyway, here is a concrete example of where the subtokens end up after
tokenizing:

######
>>> from nltk.tokenizer import *
>>> text_token = Token(TEXT='hello world this is a test')
>>> text_token.keys()
['TEXT']
>>> WhitespaceTokenizer().tokenize(text_token)
>>> text_token.keys()
['TEXT', 'SUBTOKENS']
>>> text_token['SUBTOKENS']
[<hello>, <world>, <this>, <is>, <a>, <test>]
>>> type(text_token['SUBTOKENS'][0])
<class 'nltk.token.Token'>
######

Do you understand this code, or is there something here that you're not
familiar with?


Good luck to you.


