Hello,
We're getting a fair amount of spam on the Python Wiki. Can someone with administrative privileges check that the textcha feature is enabled and has been given a set of effective questions?
Thanks,
Paul
On Mon, Mar 18, 2013 at 6:51 PM, Paul Boddie <paul@boddie.org.uk> wrote:
Hello,
We're getting a fair amount of spam on the Python Wiki. Can someone with administrative privileges check that the textcha feature is enabled and has been given a set of effective questions?
You can see that it is enabled by visiting this link: http://wiki.python.org/moin/FrontPage?action=newaccount (make sure you are logged out) As for the questions, I'm open to suggestions. -- Radomir Dopieralski, http://sheep.art.pl
On Monday 18 March 2013 19:00:37 Radomir Dopieralski wrote:
On Mon, Mar 18, 2013 at 6:51 PM, Paul Boddie <paul@boddie.org.uk> wrote:
Hello,
We're getting a fair amount of spam on the Python Wiki. Can someone with administrative privileges check that the textcha feature is enabled and has been given a set of effective questions?
You can see that it is enabled by visiting this link: http://wiki.python.org/moin/FrontPage?action=newaccount
(make sure you are logged out)
As for the questions, I'm open to suggestions.
I don't think "How many words are in this question?" is really setting the bar very high for spammers. Textcha questions are supposed to retain the context of the site on which they are placed so that bulk spamming cannot just scrape the question and serve it up to someone on some other site. This means that we should be asking Python-related questions, not simple "Are you human?" questions that ceased to be effective about ten years ago. Paul
On 18.03.2013 19:58, Paul Boddie wrote:
I don't think "How many words are in this question?" is really setting the bar very high for spammers. Textcha questions are supposed to retain the context of the site on which they are placed so that bulk spamming cannot just scrape the question and serve it up to someone on some other site. This means that we should be asking Python-related questions, not simple "Are you human?" questions that ceased to be effective about ten years ago.
Reimar is currently running a test on the Jython wiki. He marked "http" as bad content, which results in all edits that include that word being rejected.
At least on the Jython wiki, this has apparently stopped the spam pages from getting created: http://wiki.python.org/jython/RecentChanges
It's not a permanent solution, though, since it prevents adding links to pages.
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 19 2013)
Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
On 19.03.2013 09:14, M.-A. Lemburg wrote:
Reimar is currently running a test on the Jython wiki. He marked "http" as bad content, which results in all edits that include that word being rejected.
At least on the Jython wiki, this has apparently stopped the spam pages from getting created:
http://wiki.python.org/jython/RecentChanges
It's not a permanent solution, though, since it prevents adding links to pages.
I've added a new set of textchas to the Python wiki. -- Marc-Andre Lemburg PSF Vice Chairman
On Wed, Mar 20, 2013 at 12:52 AM, Paul Boddie <paul@boddie.org.uk> wrote:
On Tuesday 19 March 2013 09:30:13 M.-A. Lemburg wrote:
I've added a new set of textchas to the Python wiki.
Thanks to both of you for configuring the wiki and improving the textcha questions. This makes it a lot easier and more rewarding to keep maintaining the wiki content.
I don't know. I found myself running out of patience typing those new long phrases. Users with editing history should not suffer. Is it complicated to find a place in MoinMoin to insert a check of len(user.edits) > 5 for textcha display and validation?
-- anatoly t.
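To make the suggestion above concrete, here is a minimal sketch of such a check. It is not MoinMoin code: `user.edits` is not an attribute I know MoinMoin to provide, so the edit count is passed in as a plain integer, and the function name and threshold constant are hypothetical.

```python
# Hypothetical sketch only: the edit count is supplied by the caller,
# because no MoinMoin API for it is assumed here.
TRUSTED_EDIT_THRESHOLD = 5  # the "> 5" from the suggestion above


def needs_textcha(is_logged_in, edit_count, threshold=TRUSTED_EDIT_THRESHOLD):
    """Return True if the editor should still be shown a textcha."""
    if not is_logged_in:
        return True  # anonymous visitors and sign-ups are always challenged
    # Logged-in users skip the textcha once they have more than `threshold` edits.
    return edit_count <= threshold


print(needs_textcha(False, 0))   # True
print(needs_textcha(True, 3))    # True
print(needs_textcha(True, 12))   # False
```

Edits that were later reverted as spam would presumably have to be left out of the count, otherwise a spam account could earn its way past the check.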
On 20.03.2013 08:33, anatoly techtonik wrote:
I don't know. I found myself running out of patience typing those new long phrases. Users with editing history should not suffer. Is it complicated to find a place in MoinMoin to insert a check of len(user.edits) > 5 for textcha display and validation?
The only way to disable textchas is by adding the user name to an editor wiki group. If we can't get things under control, we will have to start using such a group and disable public editing of pages :-( -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 20 2013)
On Mar 20, 2013, at 10:08 AM, M.-A. Lemburg wrote:
If we can't get things under control, we will have to start using such a group and disable public editing of pages :-(
FWIW, we've long had this policy on the Mailman wiki. It was the only successful way to control spam. Sign up is open (and yes, we occasionally have to delete profile spam), but folks have to email the mailman-cabal to request write/edit access. We don't get a lot of such requests, so the process is quite manageable, and we get almost zero spam now. -Barry
On 20.03.2013 15:30, Barry Warsaw wrote:
On Mar 20, 2013, at 10:08 AM, M.-A. Lemburg wrote:
If we can't get things under control, we will have to start using such a group and disable public editing of pages :-(
FWIW, we've long had this policy on the Mailman wiki. It was the only successful way to control spam. Sign up is open (and yes, we occasionally have to delete profile spam), but folks have to email the mailman-cabal to request write/edit access. We don't get a lot of such requests, so the process is quite manageable, and we get almost zero spam now.
Did this have an effect on the number of editors of the wiki ? The usual complaint when doing this is that you prevent quick edits (e.g. typo corrections) by raising the bar in this way. After the recent updates to the textchas, the profile spam has apparently stopped: http://wiki.python.org/jython/RecentChanges So perhaps making the textchas a little more complicated and also starting a process to accept people to the trusted editor group would solve the problem. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 22 2013)
On Mar 22, 2013, at 12:19 PM, M.-A. Lemburg wrote:
Did this have an effect on the number of editors of the wiki ?
It probably did, but it's hard to gauge. It's not like we had a ton of non-spam editors previously. We never denied anybody write access if they asked though.
The usual complaint when doing this is that you prevent quick edits (e.g. typo corrections) by raising the bar in this way.
Yep. Unfortunately, it was taking way too much of our time weeding out the spam, so before the policy, our wiki was arguably less useful. -Barry
On Friday 22 March 2013 12:19:05 M.-A. Lemburg wrote:
Did this have an effect on the number of editors of the wiki ?
The usual complaint when doing this is that you prevent quick edits (e.g. typo corrections) by raising the bar in this way.
After the recent updates to the textchas, the profile spam has apparently stopped:
That no longer seems to be the case. I am also having to remove these stupid profile spams from the Python Wiki two or more times per day.
So perhaps making the textchas a little more complicated and also starting a process to accept people to the trusted editor group would solve the problem.
It would be nice to know which textchas the spammers managed to solve, but again I recommend non-trivial challenge questions. The MoinMoin Wiki doesn't suffer from this. Paul P.S. I'm willing to work on more advanced anti-spam measures, but only if they get used.
On 10.06.2013 15:15, Paul Boddie wrote:
On Friday 22 March 2013 12:19:05 M.-A. Lemburg wrote:
Did this have an effect on the number of editors of the wiki ?
The usual complaint when doing this is that you prevent quick edits (e.g. typo corrections) by raising the bar in this way.
After the recent updates to the textchas, the profile spam has apparently stopped:
That no longer seems to be the case. I am also having to remove these stupid profile spams from the Python Wiki two or more times per day.
So perhaps making the textchas a little more complicated and also starting a process to accept people to the trusted editor group would solve the problem.
It would be nice to know which textchas the spammers managed to solve, but again I recommend non-trivial challenge questions. The MoinMoin Wiki doesn't suffer from this.
Even though it may seem not to help as well as before, the TextChas still prevent lots of attempts. The logs are full of TextCha failures. The ratio of success to failures in the current log file (which was started yesterday at 6am UTC) is 16:9677 (!)
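For scale, the ratio quoted above can be turned into percentages with a few lines of arithmetic. The only inputs are the two numbers from the log; note that the 16 successes would include legitimate editors as well as any spammers who got through.

```python
successes = 16     # textchas answered correctly, from the log ratio above
failures = 9677    # textchas answered incorrectly

attempts = successes + failures
pass_rate = successes / float(attempts)

print("attempts:  %d" % attempts)                          # attempts:  9693
print("pass rate: %.2f%%" % (pass_rate * 100))             # pass rate: 0.17%
print("blocked:   %.2f%%" % ((1 - pass_rate) * 100))       # blocked:   99.83%
```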
P.S. I'm willing to work on more advanced anti-spam measures, but only if they get used.
In the long run, this would be a good solution. I'd certainly be willing to add such a solution to the wiki setup. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 10 2013)
On Monday 10 June 2013 16:44:27 M.-A. Lemburg wrote:
Even though it may seem not to help as well as before, the TextChas still prevent lots of attempts. The logs are full of TextCha failures.
The ratio of success to failures in the current log file (which was started yesterday at 6am UTC) is 16:9677 (!)
Impressive!
P.S. I'm willing to work on more advanced anti-spam measures, but only if they get used.
In the long run, this would be a good solution. I'd certainly be willing to add such a solution to the wiki setup.
I can think of one useful one straight away, but I don't think it is supported in Moin (or any extensions) yet. What I'd like you to tell me, if you don't mind, is the request sequence of any of the successful spam edits. Various other publishing systems provide some measures that use this information, and I'll develop one for Moin if it seems viable. By all means send me the request details in private. Thanks, Paul
On 19.03.2013 09:14, M.-A. Lemburg wrote:
Reimar is currently running a test on the Jython wiki. He marked "http" as bad content, which results in all edits that include that word being rejected.
At least on the Jython wiki, this has apparently stopped the spam pages from getting created:
http://wiki.python.org/jython/RecentChanges
It's not a permanent solution, though, since it prevents adding links to pages.
The experiment has resulted in the spam being stopped. Unfortunately, it also prohibited any edits of pages with links on them - even by regular wiki users. I've removed the http again and will add a new set of textchas for now. -- Marc-Andre Lemburg PSF Vice Chairman
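The side effect described above is what one would expect if the bad-content check is run against the full text being saved rather than only against the changed part. The following is a simplified sketch of that behaviour, not MoinMoin's actual implementation; the pattern list and function name are made up for illustration.

```python
import re

# Assumption: a single bad-content pattern, as in the experiment described above.
BAD_CONTENT_PATTERNS = [re.compile(r"http")]


def is_rejected(page_text):
    """Return True if any bad-content pattern occurs anywhere in the saved text."""
    return any(pattern.search(page_text) for pattern in BAD_CONTENT_PATTERNS)


# A page that already contains a link, and an edit that only touches the wording.
existing_page = "See http://www.python.org/ for downloads."
wording_fix = existing_page.replace("downloads", "the downloads")

print(is_rejected("Plain text without links"))  # False
print(is_rejected(wording_fix))                 # True, although no link was added
```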
On 20.03.2013 09:44, M.-A. Lemburg wrote:
On 19.03.2013 09:14, M.-A. Lemburg wrote:
Reimar is currently running a test on the Jython wiki. He marked "http" as bad content, which results in all edits that include that word being rejected.
At least on the Jython wiki, this has apparently stopped the spam pages from getting created:
http://wiki.python.org/jython/RecentChanges
It's not a permanent solution, though, since it prevents adding links to pages.
The experiment has resulted in the spam being stopped. Unfortunately, it also prohibited any edits of pages with links on them - even by regular wiki users.
I've removed the http again and will add a new set of textchas for now.
Within a few minutes of removing the "http", the spam started rolling in again. I hope the new textchas will raise the bar a bit. -- Marc-Andre Lemburg PSF Vice Chairman
In order to protect our wikis against excessive spam, we are using textchas (short questions and answers) which users have to answer before they can commit their changes or sign up to the sites.
I found that short Python snippets work well, since they require domain knowledge that you cannot simply look up in Wikipedia, e.g. "l = [1,2,3]; del l[1]; l[0] ==".
It would be great if you could send me some more ideas for such textchas (please don't post the answers to this list; just posting the questions is fine).
PS: Please make sure that the questions work in both Python 2 and 3 and produce the same or at least similar answers.
Thanks,
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 20 2013)
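One way to check a candidate question before sending it in is to let the interpreter compute the answer, and to run the same script under both Python 2 and Python 3 to compare. The helper below is a rough sketch under that assumption; its name is hypothetical, it is not part of MoinMoin, and it should only ever be fed questions you wrote yourself, since it executes them. The example question is invented for this sketch and is not one of the wiki's textchas.

```python
def expected_answer(question):
    """Evaluate a textcha of the form 'stmt; stmt; expr ==' and return repr(result).

    Naive by design: it splits on ';', so string literals containing ';' break it.
    """
    body = question.strip()
    if body.endswith("=="):
        body = body[:-2]
    parts = [part.strip() for part in body.split(";")]
    statements, expression = parts[:-1], parts[-1]
    namespace = {}
    for statement in statements:
        exec(statement, namespace)  # run the setup statements
    return repr(eval(expression, namespace))  # evaluate the final expression


# Invented example question; run under both interpreters and compare the output.
print(expected_answer("s = 'wiki'; s = s.upper(); len(s) =="))  # 4
```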
On 18.03.2013 18:51, Paul Boddie wrote:
Hello,
We're getting a fair amount of spam on the Python Wiki. Can someone with administrative privileges check that the textcha feature is enabled and has been given a set of effective questions?
We've been seeing the same development on the Jython wiki. I guess they just realized that the Python wiki will likely get them even more Google juice: http://wiki.python.org/jython/RecentChanges
Unfortunately, the wiki spam appears to come from real humans, so textchas don't really help much. I've also checked IP ranges, but that doesn't help either. Of course, ideas are welcome :-)
We may end up having to require people to take some extra step in order to open an account on the wikis. Unfortunately, that makes it harder for non-spam wiki editors to sign up as well.
I can add new textchas, if you like. Please send them directly to me. Alternatively, it'd probably be a good idea to get you access to the wiki VM, so you can edit them directly.
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 18 2013)
* M.-A. Lemburg <mal@egenix.com>:
We've been seeing the same development on the Jython wiki. I guess they just realized that the Python wiki will likely get them even more Google juice:
http://wiki.python.org/jython/RecentChanges
Unfortunately, the wiki spam appears to come from real humans, so textchas don't really help much.
I've also checked IP ranges, but it doesn't help either.
Of course, ideas are welcome :-)
Maybe block using some blacklists (we're using sbl.spamhaus.org) using mod-spamhaus or similar modules. -- Ralf Hildebrandt Charite Universitätsmedizin Berlin ralf.hildebrandt@charite.de Campus Benjamin Franklin http://www.charite.de Hindenburgdamm 30, 12203 Berlin Geschäftsbereich IT, Abt. Netzwerk fon: +49-30-450.570.155
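For readers unfamiliar with how such blacklists are queried: a DNS blacklist is looked up by reversing the octets of the client's IPv4 address, appending the list's zone, and resolving the resulting name. The sketch below shows that mechanism in plain Python rather than through mod-spamhaus. The function names are made up, and treating only 127.0.0.x answers as listings is an assumption about the zone's return codes, so check the list operator's documentation and usage terms before relying on it.

```python
import socket


def dnsbl_query_name(ipv4_address, zone="sbl.spamhaus.org"):
    """Build the DNS name to look up: reversed octets followed by the zone."""
    octets = ipv4_address.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted IPv4 address: %r" % ipv4_address)
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ipv4_address, zone="sbl.spamhaus.org"):
    """Return True if the zone answers with a 127.0.0.x code (assumed to mean listed)."""
    try:
        answer = socket.gethostbyname(dnsbl_query_name(ipv4_address, zone))
    except socket.error:
        return False  # no record or lookup failure: treated as not listed
    return answer.startswith("127.0.0.")


# 192.0.2.10 is a documentation address; only the query name is built here.
print(dnsbl_query_name("192.0.2.10"))  # 10.2.0.192.sbl.spamhaus.org
```

Because the lookup is keyed on the client's IP address, nothing in the mechanism is specific to email; whether the listed addresses overlap with the ones hitting the wiki is a separate question.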
On 18.03.2013 19:03, Ralf Hildebrandt wrote:
* M.-A. Lemburg <mal@egenix.com>:
We've been seeing the same development on the Jython wiki. I guess they just realized that the Python wiki will likely get them even more Google juice:
http://wiki.python.org/jython/RecentChanges
Unfortunately, the wiki spam appears to come from real humans, so textchas don't really help much.
I've also checked IP ranges, but it doesn't help either.
Of course, ideas are welcome :-)
Maybe block using some blacklists (we're using sbl.spamhaus.org) using mod-spamhaus or similar modules.
Do they have RBLs for wikis ? I thought they only do email blacklisting. Their website appears to be down at the moment. I'll check again later. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 18 2013)
On 18/03/2013 18:00, M.-A. Lemburg wrote:
We've been seeing the same development on the Jython wiki. I guess they just realized that the Python wiki will likely get them even more Google juice:
http://wiki.python.org/jython/RecentChanges
Of course, ideas are welcome :-)
I've had success with Akismet: http://akismet.com/ Not that hard to wire in for a blog, maybe the same could be done for wiki edits? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk
Does MoinMoin support moderated edits by new users?
-- anatoly t.
On Mon, Mar 18, 2013 at 8:51 PM, Paul Boddie <paul@boddie.org.uk> wrote:
Hello,
We're getting a fair amount of spam on the Python Wiki. Can someone with administrative privileges check that the textcha feature is enabled and has been given a set of effective questions?
Thanks,
Paul _______________________________________________ pydotorg-www mailing list pydotorg-www@python.org http://mail.python.org/mailman/listinfo/pydotorg-www
On Monday 18 March 2013 21:56:04 anatoly techtonik wrote:
Does MoinMoin support moderated edits by new users?
Not out of the box, at least to my knowledge, but I have written an extension that puts edits in an approval queue for each page if the contributor is not part of the approved contributor group. That said, just asking more sophisticated textcha questions will mostly take care of this problem, as we saw with this Wiki before. Paul
participants (8)
- anatoly techtonik
- Barry Warsaw
- Chris Withers
- M.-A. Lemburg
- M.-A. Lemburg
- Paul Boddie
- Radomir Dopieralski
- Ralf Hildebrandt