[spambayes-dev] Web filter
John Mulholland
sl6dt at cc.usu.edu
Tue Jan 20 17:08:26 EST 2004
I was just rereading some of the old discussions about a Bayesian web filter.
I am going to try to write one this semester for a grad-level class project.
I think that I can find solutions to some of the problems that you have
mentioned.
There are some very interesting characteristics about porn pages. First of
all, they are often linked through JavaScript. It isn't difficult to
automate a process to find lots of them. Maybe I am naive, but it seems they
are quite similar to each other and very different from most other web pages.
At least that is the case with most of them. Since one of the purposes of my
program is to protect people from accidentally going to a porn site, a false
negative is much more serious than a false positive. I definitely agree that
an open effort to make a base package of N sites we definitely want blocked
would be very helpful. To check out a site, it is as simple as
telnet abc.com 80
GET / HTTP/1.1
Host: abc.com
followed by a blank line to end the request (HTTP/1.1 requires the Host
header). Then you can get the HTML and analyze it.
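The same fetch can be scripted. Below is a minimal sketch using only the
Python standard library, not a finished design. The names fetch_html and
tokenize_html are hypothetical, and the tokenization (lowercased words plus
"href:" link tokens, with script and style bodies skipped) is only one
assumption about what features a Bayesian classifier might be trained on.

```python
import http.client
import re
from html.parser import HTMLParser


class _TextAndLinks(HTMLParser):
    """Collects visible text and <a href> targets from an HTML page."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.text_parts.append(data)


def fetch_html(host, path="/", timeout=10):
    """Issue the same GET as the telnet session (hypothetical helper).

    http.client sends the Host header automatically.
    """
    conn = http.client.HTTPConnection(host, 80, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.read().decode("utf-8", errors="replace")
    finally:
        conn.close()


def tokenize_html(html):
    """Turn a page into word tokens followed by 'href:' link tokens."""
    parser = _TextAndLinks()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.text_parts).lower()
    words = re.findall(r"[a-z0-9']+", text)
    link_tokens = ["href:" + link.lower() for link in parser.links]
    return words + link_tokens
```

Something like tokenize_html(fetch_html("abc.com")) would then give a token
list to train or score against. Whether script bodies should be skipped is an
open question, since the JavaScript links themselves may turn out to be a
useful clue.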
If an open effort does start to list sites, we should also make sure to have
different categories, because some people may not want to look at nude art
while others may think that it is OK.
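One way the category idea might look in code is sketched below. The category
names, the threshold values, and the function categories_to_block are all
hypothetical, chosen only for illustration. The thresholds are set low on
purpose, on the assumption that a false negative is more serious than a false
positive, so a page is blocked even when the classifier is only moderately
suspicious.

```python
# Hypothetical categories and thresholds, for illustration only.
# Lower threshold = block more aggressively (fewer false negatives,
# more false positives).
BLOCK_THRESHOLDS = {
    "porn": 0.30,
    "nude_art": 0.50,
}


def categories_to_block(scores, enabled_categories, thresholds=BLOCK_THRESHOLDS):
    """Return the enabled categories whose score reaches its threshold.

    scores: mapping of category name -> probability (0.0 to 1.0) that
            the page belongs to that category.
    enabled_categories: the categories this user has chosen to block.
    """
    return sorted(
        category
        for category in enabled_categories
        if scores.get(category, 0.0) >= thresholds[category]
    )


def should_block(scores, enabled_categories, thresholds=BLOCK_THRESHOLDS):
    """Block the page if any enabled category is triggered."""
    return bool(categories_to_block(scores, enabled_categories, thresholds))
```

A user who is fine with nude art would simply leave "nude_art" out of
enabled_categories, and pages scoring high only in that category would pass.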
If people are interested in this, please contact me at sl6dt at cc.usu.edu and
let me know. I would appreciate any ideas or suggestions, because I am fairly
new to the Linux world and there are many things in this project that I have
no idea how to do. My goal is to make a very effective, robust, easy-to-use,
customizable, free web filter that most people can use, including Windows
users.
John Mulholland