A Free Idea: Search Engine for Webpages

Serge Boiko serge.boiko at gmx.net
Sat Aug 24 20:34:44 CEST 2002

Thomas Guettler <pan-newsreader at thomas-guettler.de> writes:

> On Sat, 24 Aug 2002 12:58:38 +0200, Serge Boiko wrote:
> > I've just come up with an idea which IMHO looks attractive. Imagine
> > that you have a looong web page and you'd like to find occurrences of some
> > phrase, not all of which are of interest. You run your software on that
> > page and it builds a list of all occurrences; clicking on an entry will
> > bring you to the place where it occurs. So it's something like pydoc, but
> > it works on an arbitrary web page.
> Doesn't sound too difficult: use Python's HTML parser. Make a dictionary of all
> words on the page. After parsing, create an HTML page with all the words of the
> page (alphabetically sorted).
Yes, it doesn't look that difficult to implement in Python.
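
A minimal sketch of the indexing step suggested above, assuming Python 3's
standard html.parser module (a 2002-era script would have used htmllib or
sgmllib instead). The names WordIndexer and build_index are hypothetical, and
"word" is taken to mean a run of ASCII letters and digits, which is only one
possible definition:

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

# Assumption: a "word" is a run of ASCII letters and digits.
WORD_RE = re.compile(r"[A-Za-z0-9]+")


class WordIndexer(HTMLParser):
    """Collect every word of the visible text with its occurrence number."""

    SKIP = {"script", "style"}  # contents are code, not page text

    def __init__(self):
        super().__init__()
        self.index = defaultdict(list)  # word -> list of occurrence numbers
        self.count = 0                  # running number of words seen so far
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return
        for match in WORD_RE.finditer(data):
            self.index[match.group().lower()].append(self.count)
            self.count += 1


def build_index(page):
    """Return a dict mapping each word to its positions, sorted by word."""
    parser = WordIndexer()
    parser.feed(page)
    parser.close()
    return dict(sorted(parser.index.items()))
```

Each occurrence number could later double as an anchor name, so that the
sorted word list links straight to the matching place in the page.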

> The problem is that you can link only to anchors in a page (#foo). So
> you need to modify the original page and put a mark after each word.
Absolutely, that is why I'm thinking of a small web server that listens on
a free port on localhost; you only have to point your browser there and it
displays the modified web page.
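
A rough sketch of that idea, assuming Python 3's standard html.parser and
http.server modules. The names AnchorInserter, annotate and make_server are
hypothetical, the anchor ids w0, w1, ... are an arbitrary choice, and the
anchor is placed before each word rather than after it. It serves one fixed
page only; fetching a remote page and rewriting its relative links are left
out:

```python
import html
import re
from html.parser import HTMLParser
from http.server import BaseHTTPRequestHandler, HTTPServer

WORD_RE = re.compile(r"[A-Za-z0-9]+")  # assumption: ASCII words only
RAW = {"script", "style"}              # content copied through untouched
PLAIN = {"title", "textarea"}          # text re-escaped but not anchored


class AnchorInserter(HTMLParser):
    """Re-emit the page with an empty <a id="wN"> before the N-th word."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.count = 0
        self._inside = None  # name of the RAW/PLAIN element we are in, if any

    def handle_starttag(self, tag, attrs):
        if tag in RAW or tag in PLAIN:
            self._inside = tag
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == self._inside:
            self._inside = None
        self.out.append("</%s>" % tag)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

    def handle_data(self, data):
        if self._inside in RAW:
            self.out.append(data)
            return
        if self._inside in PLAIN:
            self.out.append(html.escape(data, quote=False))
            return
        pos = 0
        for match in WORD_RE.finditer(data):
            # Text between words was entity-decoded by the parser: re-escape it.
            self.out.append(html.escape(data[pos:match.start()], quote=False))
            self.out.append('<a id="w%d"></a>%s' % (self.count, match.group()))
            self.count += 1
            pos = match.end()
        self.out.append(html.escape(data[pos:], quote=False))


def annotate(page):
    """Return the page source with one anchor inserted per word."""
    parser = AnchorInserter()
    parser.feed(page)
    parser.close()
    return "".join(parser.out)


def make_server(page):
    """Build a server on a free localhost port that serves the marked page."""
    body = annotate(page).encode("utf-8")

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    # Port 0 asks the operating system to pick any free port.
    return HTTPServer(("127.0.0.1", 0), Handler)
```

After server = make_server(page), the chosen port is server.server_address[1];
calling server.serve_forever() and pointing the browser at
http://127.0.0.1:<port>/#w2 would jump to the third word of the page.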

> > I would love to do it myself, but I'm about to finish my PhD thesis; so
> > I have no chance. If anyone is interested in taking on such a project I
> > would be happy. Or maybe it's already done? Then I would be happy to
> > know about that.
> I have no time for that. The few hours per week I have for coding in my
> spare time are reserved for other things.
>  thomas

Maybe someone else :-)?

I frequently *need* such functionality while browsing lists of journals,
online libraries, etc. 

