[Tutor] Catching cheaters with Python

Timothy Wilson wilson@visi.com
Thu, 29 Nov 2001 10:39:11 -0600 (CST)


Greetings all:

As a high school teacher I'm confronted regularly with the problem of
student plagiarism. Students are tempted to submit the same paper to
different instructors, copy large chunks of text from Web sites, and, much
less frequently, purchase research papers online. Google is an amazing tool
for catching the second type. Simply identify a particularly interesting
sentence from the suspected paper, enter the sentence into Google
surrounded by quotes, and bingo, you've got it. It typically takes less
than 10 seconds.
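
A minimal sketch of that quoted-search step, for illustration only. The
function name exact_phrase_url is hypothetical, and the search address and
its "q" parameter are assumptions about the public search page, not a
documented API. It only builds the URL; it does not fetch anything.

```python
from urllib.parse import urlencode

# Assumption: the public search page takes the query in a "q" parameter.
SEARCH_URL = "https://www.google.com/search"


def exact_phrase_url(sentence):
    """Return a search URL that looks for the sentence as an exact phrase."""
    # Collapse runs of whitespace so line breaks in the paper
    # don't split the phrase.
    phrase = " ".join(sentence.split())
    # Wrapping the phrase in double quotes asks for an exact match;
    # urlencode takes care of escaping the quotes and spaces.
    return SEARCH_URL + "?" + urlencode({"q": '"%s"' % phrase})


print(exact_phrase_url("It was the best of times,\nit was the worst of times"))
```

Pasting the printed URL into a browser should be equivalent to typing the
quoted sentence into the search box by hand.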

A year or so ago a professor at some university made the news when he wrote
a program that would automatically scan submitted papers and identify
passages that were likely plagiarized from other students. I would like to
extend that to do the following:

1. Check for similarities between submitted papers and
2. Submit suspicious sentences to Google for searching.
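
One possible brute-force take on step 1, sketched under assumptions: each
paper is reduced to its set of overlapping five-word sequences ("shingles"),
and every pair of papers is scored by the fraction of shingles they share
(Jaccard similarity). The function names, the shingle length of 5, and the
0.1 threshold are all hypothetical choices for illustration, not anything
the professor's program is known to use.

```python
import re
from itertools import combinations


def shingles(text, n=5):
    """Return the set of n-word sequences in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def similarity(a, b):
    """Jaccard similarity of two shingle sets: shared / total distinct."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def suspicious_pairs(papers, threshold=0.1, n=5):
    """Compare every pair of papers.

    papers maps a student name to the plain text of the paper.
    Returns (score, name1, name2) tuples at or above threshold,
    highest score first.
    """
    sets = {name: shingles(text, n) for name, text in papers.items()}
    hits = []
    for x, y in combinations(sorted(sets), 2):
        score = similarity(sets[x], sets[y])
        if score >= threshold:
            hits.append((score, x, y))
    return sorted(hits, reverse=True)
```

The shingles two papers have in common (sets[x] & sets[y]) are also natural
candidates for the sentences worth checking in step 2.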

I've thought that it would be possible to bring all of our students to the
computer lab on the day the paper is due and have them upload their paper
via a Web form. I use Zope here so I could write some code to convert
MSWord-formatted papers to plain text. Students who didn't use Word could
cut and paste their text directly into a textarea on the form.

Does anyone have any hints on something like this? Any modules that would be
particularly useful? Obviously, this could be a huge AI project, but I'm not
interested in that particularly. A brute force approach is all I'm smart
enough to try. :-) Once I've got the text from their papers I can set the
program up to run in the background for days if I have to. That said, the more
efficiently the program could identify matching patterns the better.
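
On the efficiency point, one common trick (offered here as a sketch, not a
tested solution) is to build an index from each word sequence to the papers
that contain it. Only papers that actually share a sequence are ever paired
up, so unrelated papers are never compared against each other. The names
shingles and shared_shingle_counts and the sequence length of 5 are
hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations


def shingles(text, n=5):
    """Return the set of n-word sequences in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_shingle_counts(papers, n=5):
    """Count shared n-word sequences for each pair of papers.

    papers maps a student name to the plain text of the paper.
    Returns {(name1, name2): count} for pairs sharing at least one sequence.
    """
    # Inverted index: word sequence -> names of the papers containing it.
    index = defaultdict(set)
    for name, text in papers.items():
        for sh in shingles(text, n):
            index[sh].add(name)

    # Only papers that appear together under some sequence get paired.
    counts = defaultdict(int)
    for names in index.values():
        for pair in combinations(sorted(names), 2):
            counts[pair] += 1
    return dict(counts)
```

Pairs with a high count relative to the length of the papers would be the
ones to read side by side.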

I'd love to hear others' thoughts on this.

-Tim

--
Tim Wilson      |   Visit Sibley online:   | Check out:
Henry Sibley HS |  http://www.isd197.org   | http://www.zope.com
W. St. Paul, MN |                          | http://slashdot.org
wilson@visi.com |  <dtml-var pithy_quote>  | http://linux.com