[Tutor] rfc822.Message [making Python-tutor more searchable]
Danny Yoo
dyoo@hkn.eecs.berkeley.edu
Sun Mar 16 23:56:01 2003
On Sun, 16 Mar 2003, Erik Price wrote:
> I would like to see it. I like Java and use it at work, so I've been
> meaning to learn more about Jython.
Hi Erik,
Argh. I wasted all my time today watching TV. *grin*
But since I did promise I'd put something up, I'd better not renege on
that.
I've written two Jython scripts: the first reads in a mailbox and, using
the Jakarta Lucene engine, builds an index directory for its messages. The
second script checks that the indexing actually worked.
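For anyone who wants the general idea before downloading anything, here is
a minimal sketch of the "read a mailbox, index the messages" step. It is
not the Jython/Lucene script itself: it assumes plain Python 3 with the
standard-library mailbox and email modules, and it swaps Lucene for a toy
in-memory inverted index. The names tokenize, body_text, build_index and
index_mbox are made up for illustration.

```python
import email
import mailbox
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return TOKEN.findall(text.lower())


def body_text(msg):
    """Return the payload of the first text/plain part, or '' if none."""
    if msg.is_multipart():
        for part in msg.walk():
            if part.get_content_type() == "text/plain":
                return str(part.get_payload())
        return ""
    return str(msg.get_payload())


def build_index(messages):
    """Build a toy inverted index: field -> term -> set of message numbers."""
    index = defaultdict(lambda: defaultdict(set))
    for number, msg in enumerate(messages):
        fields = {
            "from": msg.get("From", ""),
            "to": msg.get("To", ""),
            "subject": msg.get("Subject", ""),
            "body": body_text(msg),
        }
        for field, text in fields.items():
            for term in tokenize(str(text)):
                index[field][term].add(number)
    return index


def index_mbox(path):
    """Index every message in the mbox file at `path` (hypothetical helper)."""
    return build_index(mailbox.mbox(path))


if __name__ == "__main__":
    # Tiny hand-made message instead of a real mailbox.
    sample = email.message_from_string(
        "From: Erik Price <erikprice@mac.com>\n"
        "Subject: jython and dynamic typing\n"
        "\n"
        "I have been meaning to learn more about Jython.\n"
    )
    idx = build_index([sample])
    print(sorted(idx["from"]["erik"]))
```

A real engine like Lucene does much more (stemming, scoring, storing the
index on disk), so treat this only as a picture of the data flow.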
Since the files are a bit large, I'll just post temporary URLs to them:
http://hkn.eecs.berkeley.edu/~dyoo/jython/load_indices.jy
http://hkn.eecs.berkeley.edu/~dyoo/jython/test_searching.jy
As a sample source of messages, I used the Python-Tutor archive:
http://mail.python.org/pipermail/tutor.mbox/tutor.mbox
To run the engine, you'll need to grab the Jakarta Lucene library; it's
located here:
http://jakarta.apache.org/lucene/docs/index.html
I have to admit that my scripts are really really messy and not quite...
umm... commented yet. But I hope to fix that. *grin* Please feel free to
read and give feedback on anything that looks silly about the program.
If the indexing itself looks robust, I'd love to set this up as an
improved search engine for Python-Tutor; does anyone have a machine they'd
be willing to put something like this on?
To whet people's appetite, here's a sample of the kinds of queries we can
do:
###
query? from:erik AND jython
2 hits found.
14153 Erik Price <erikprice@mac.com> tutor@python.org [Tutor] jython and dynamic typing
20707 Erik Price <erikprice@mac.com> alan.gauld@bt.com Re: [Tutor] Sun says: Don't use Java, use Python!
query? danny AND lucene
1 hits found.
21230 Danny Yoo <dyoo@hkn.eecs.berkeley.edu> Bob Gailer <ramrom@earthling.net> Re: [Tutor] Enter: Matt
query? "hello world"
344 hits found.
... [cut short, since output was a bit long... *grin*]
query? subject:"help"
1134 hits found.
... [cut short for obvious reasons. *grin*]
query? body:"hope this helps"
990 hits found.
###
Let's make that one more. *grin* Hope this helps!
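P.S. To give a rough feel for what a query like "from:erik AND jython"
means, here is a small sketch in plain Python 3. It is not how Lucene
evaluates queries; it only handles single terms joined by AND (no quoted
phrases), it expects a nested dict of the form field -> term -> set of
message numbers, and it assumes that a bare term may match in any field.
The names lookup and search are hypothetical.

```python
def lookup(index, clause):
    """Return message numbers matching one 'field:term' or bare 'term' clause."""
    field, sep, term = clause.partition(":")
    if sep:
        # Fielded clause: look only inside that field.
        return set(index.get(field.lower(), {}).get(term.lower(), set()))
    # Bare term: assumed here to match in any field.
    hits = set()
    for terms in index.values():
        hits |= terms.get(clause.lower(), set())
    return hits


def search(index, query):
    """Intersect the hits of all clauses joined by ' AND '; return a sorted list."""
    clauses = [c.strip() for c in query.split(" AND ") if c.strip()]
    if not clauses:
        return []
    hits = lookup(index, clauses[0])
    for clause in clauses[1:]:
        hits &= lookup(index, clause)
    return sorted(hits)


if __name__ == "__main__":
    # Hand-made index with three pretend messages numbered 0, 1 and 2.
    toy_index = {
        "from": {"erik": {0, 2}, "danny": {1}},
        "subject": {"jython": {0}},
        "body": {"jython": {0, 1}, "lucene": {1}},
    }
    print(search(toy_index, "from:erik AND jython"))
    print(search(toy_index, "danny AND lucene"))
```

Each clause becomes a set of message numbers, and AND is just set
intersection; that is the core idea behind this kind of boolean search.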