[Tutor] Fwd: thesaurus

Rich Lovely roadierich at googlemail.com
Thu Jul 9 05:12:55 CEST 2009


2009/7/9 Pete Froslie <froslie at gmail.com>:
> I see.. that makes sense. Kind of new with python -- sorry for that.
>
> after printing using this:
>
> print urllib.urlopen('http://words.bighugelabs.com/api/2/e413f24801aa30b8d441ca43a64317be/moving/').read()
>
> I'm getting a format like this returned:
>
> adjective|sim|streaming
> adjective|sim|swirling
> adjective|sim|tossing
> adjective|sim|touching
> adjective|sim|touring
> adjective|sim|traveling
> adjective|sim|tumbling
>
> I assume I need to clean this up by reading past 'adjective|sim|' to
> 'streaming' and then returning it from lookup()..
> this will be happening in the following:
>
> urllib.urlopen('http://words.bighugelabs.com/api/2/e413f24801aa30b8d441ca43a64317be/moving/').read(SOMETHING HERE)
>
>
>
I don't think there is any easy way of doing that.  The argument to
read() is only a byte count, so it can't be used to skip part of each
line.

You would be better off reading the whole response, then using the
split method of strings on each line and ignoring the first parts for
now.
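
Something along these lines might do it.  This is only a rough sketch:
the sample text is copied from the output you pasted rather than
fetched from the API, extract_words is a name I've made up, and I'm
assuming every line has the word you want as its last '|'-separated
field.

```python
# Sample response text copied from the output quoted above; a real
# program would get this string from the .read() call instead.
response = """adjective|sim|streaming
adjective|sim|swirling
adjective|sim|tossing
"""


def extract_words(text):
    """Return the last '|'-separated field of each non-empty line."""
    words = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed layout: part_of_speech|relation|word
        fields = line.split("|")
        words.append(fields[-1])
    return words


synonyms = extract_words(response)
print(synonyms)
```

Keeping the whole list, instead of returning just the first word,
leaves you free to decide later how lookup() picks one.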

Have you considered what happens with the word "set" in the following
sentence, about testing TV receivers:

We've set the code number of each set to a reasonable value for this
set of experiments.

I can't get the API to work for me, but the way you're doing it at the
moment, you'd end up with something like:

We've fixed the code number of each fixed to a reasonable value for
this fixed of experiments.

A contrived example, I know, but it makes the point.

Unless, of course, this sort of gibberish is what you're after.
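
To see it happen without the API, here is a small sketch.  The
lookup() below is a hypothetical stand-in that I've hard-coded to
return "fixed" for "set"; the real one would ask the thesaurus, but it
would have the same problem, since it only ever sees one word at a
time.

```python
def lookup(word):
    """Hypothetical stand-in for the thesaurus lookup.

    It returns the same synonym for "set" whatever the context, and
    leaves every other word alone.
    """
    synonyms = {"set": "fixed"}
    return synonyms.get(word, word)


sentence = ("We've set the code number of each set to a reasonable value "
            "for this set of experiments.")

# Replace word by word, with no idea of what part of speech each one is.
replaced = " ".join(lookup(word) for word in sentence.split())
print(replaced)
```

The verb and both nouns get the same replacement, which is where the
nonsense comes from.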

Natural language parsers are one of the hardest things to create.
Just look up the word "set" in a dictionary to see why.  Even if you
did work out that the second "set" was a noun, is it "a radio or
television receiver" or "a group or collection of things that belong
together, resemble one another, or are usually found together"?

-- 
Richard "Roadie Rich" Lovely, part of the JNP|UK Famile
www.theJNP.com

