How to use mxTextTools
donn at u.washington.edu
Thu Dec 14 19:02:09 CET 2000
Quoth Paul Moore <paul.moore at uk.origin-it.com>:
| I'm looking at mxTextTools to see if it would be suitable for some
| types of text parsing work I am interested in (nothing concrete yet,
| so I can't give specifics...)
| The example in the documentation of tagging HTML looks fine - I
| understand what's going on there, and as I understand it, this will
| give me back a taglist, which is (effectively) the text stream with
| portions tagged as I ask.
| What I don't see (yet), and I can't find any good examples for, is
| what to do with the resulting taglist. There seem to be no functions
| for working with taglists, and the lists themselves seem like
| relatively complex data structures, so is it right that I should be
| manipulating them "by hand"?
| More information, or better still, some complete examples, would be
| very helpful. (All the examples in the distribution just use
| print_tags() to display the tags, and don't do anything with them...)
Yes, I think that's right, you get to manipulate the tag lists by
hand. That is, the TextTools.tag() function hands you the result
of analysis according to your table structure, and you make of it
what you will. That will naturally depend on your application,
which is why it's up to you.
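To make "by hand" a little more concrete, here is a minimal sketch of walking a taglist. It assumes the shape described in the mxTextTools documentation: a list of (tagobject, left, right, subtags) tuples, where text[left:right] is the matched slice and subtags is a nested taglist or None. The walk() helper and the sample data are hypothetical, written out by hand so the example runs without mxTextTools installed; in real use the taglist would be the second item of the tuple returned by tag().

```python
def walk(taglist, text, depth=0):
    """Yield (depth, tagobject, matched_text) for every entry, depth-first."""
    # Assumed entry shape: (tagobject, left, right, subtags)
    for tagobj, left, right, subtags in taglist or []:
        yield depth, tagobj, text[left:right]
        if subtags:
            # Nested taglists have the same shape, so recurse.
            yield from walk(subtags, text, depth + 1)


# Hand-written sample standing in for a taglist produced by tag().
text = "<b>hi</b>"
taglist = [
    ("tag", 0, 3, [("name", 1, 2, None)]),
    ("text", 3, 5, None),
    ("tag", 5, 9, [("name", 7, 8, None)]),
]

for depth, tagobj, matched in walk(taglist, text):
    print("  " * depth + "%s: %r" % (tagobj, matched))
```

Since the entries carry only offsets, the original text has to be kept around to recover the matched slices.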
It's a convenient division of responsibility for me, but the Python
world also has some more classical parsers, where the parser calls
application functions as it analyzes the input. I have tried
kwParsing, for example, if that's more like what you're looking for.
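If the callback style appeals, it can be approximated on top of a taglist after the fact. The sketch below is only an illustration of that idea, not how kwParsing or mxTextTools work internally: dispatch() and the handlers mapping are hypothetical names, and the taglist is hand-written in the assumed (tagobject, left, right, subtags) shape.

```python
def dispatch(taglist, text, handlers):
    """Call handlers[tagobject](matched_text) for each tagged span."""
    for tagobj, left, right, subtags in taglist or []:
        handler = handlers.get(tagobj)
        if handler is not None:
            handler(text[left:right])
        if subtags:
            # Visit nested entries with the same handler table.
            dispatch(subtags, text, handlers)


# Hand-written sample standing in for a taglist produced by tag().
text = "spam eggs"
taglist = [
    ("word", 0, 4, None),
    ("space", 4, 5, None),
    ("word", 5, 9, None),
]

words = []
dispatch(taglist, text, {"word": words.append})
print(words)
```

The difference from a classical parser is timing: here the application functions run once tagging has finished, not while the input is being analyzed.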
Donn Cave, donn at u.washington.edu