Dr. Dobb's Python-URL! - weekly Python news and links (Dec 17)

Hans Nowak wurmy at earthlink.net
Mon Dec 17 22:19:33 CET 2001


Paul Boddie wrote:

>     Mygale seems to make the "Python-URL!" editor's job redundant...
>     or easier?
>         http://www.awaretek.com/nowak/mygale.html

Easier. A program is pretty poor at judging whether an article is
related to Python or not. :-)

I might as well make this a semi-official announcement:

Mygale is a specialized web crawler that searches news sites for
Python-related articles. The current version is 0.6.x, and it should
be reasonably stable. Not everything I wanted has been implemented
yet, though.

The main problem is that some news sites seem to return any old
article when you do a search for "python". Some articles just
contain the word, but are unrelated; can't really blame the
site's search engine for this. Other articles don't even contain
the p-word at all. >=(  This kind of thing is very hard to 
filter out (at least I think so). There are also problems with
determining the dates of some articles.
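
To make the two failure cases above concrete, here is a minimal sketch
of a keyword filter. It is not Mygale's actual code: the function name
classify() and the CONTEXT_WORDS list are made up for illustration, and
a fixed word list like this is easy to fool in both directions.

```python
import re

# Hypothetical context words that hint at Python-the-language.
# This list is an assumption for illustration, not taken from Mygale.
CONTEXT_WORDS = {
    "programming", "language", "script", "scripting",
    "software", "zope", "jython", "guido",
}

PYTHON_RE = re.compile(r"\bpython\b", re.IGNORECASE)
WORD_RE = re.compile(r"[a-z]+")


def classify(text):
    """Sort an article's text into one of three rough buckets.

    'no-mention'     - the word "python" does not occur at all
    'mention-only'   - the word occurs, but no context word does
    'likely-related' - the word occurs together with a context word
    """
    if not PYTHON_RE.search(text):
        return "no-mention"
    words = set(WORD_RE.findall(text.lower()))
    if words & CONTEXT_WORDS:
        return "likely-related"
    return "mention-only"
```

Articles in the 'no-mention' bucket can be dropped safely. The
'mention-only' bucket is the hard one: it will hold snake and Monty
Python stories, but also genuine articles that happen to use none of
the context words.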

More info at the aforementioned site. Suggestions, bug reports,
patches, new extractors, etc. are welcome. :)

--Hans


