[Jenkins-infra] Fighting with wiki spam

Kohsuke Kawaguchi kk at kohsuke.org
Sat Mar 7 06:14:52 UTC 2015


I started playing with this idea.

I set up a mailing list
<https://groups.google.com/forum/#!forum/jenkinsci-spambot>, fed wiki
notifications into it, and got a bot running. Right now, the bot tries to
determine whether each new page is in Japanese, English, or Indonesian,
and just replies with that info to the list.
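
To make the detect-and-reply step concrete, here is a rough sketch in Python. It is not the bot's actual code: the stopword lists, `guess_language`, and `build_reply` are all hypothetical stand-ins, and a real bot would call a proper language detection library instead of counting stopwords.

```python
import re

# Hypothetical, tiny stopword lists used only for illustration.
# A real bot would rely on a language detection library instead.
STOPWORDS = {
    "English": {"the", "and", "is", "of", "to", "in", "for", "with"},
    "Indonesian": {"yang", "dan", "di", "untuk", "dengan", "ini", "itu", "dari"},
}

def guess_language(text):
    """Return 'Japanese', 'English', 'Indonesian', or 'unknown'."""
    # Hiragana or katakana (U+3040 to U+30FF) is a strong signal for Japanese.
    if re.search(r"[\u3040-\u30ff]", text):
        return "Japanese"
    words = re.findall(r"[a-z]+", text.lower())
    # Count how many words of the text appear in each stopword list.
    scores = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

def build_reply(page_title, text):
    """Format the one-line report the bot would send back to the list."""
    return f"Page '{page_title}' looks like: {guess_language(text)}"
```

Running this over a few days of notifications and reading the replies is the same kind of accuracy check described here, just with a much cruder detector.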

I'm going to keep it like that for a few days to make sure it's detecting
accurately, then I can implement the auto page removal.
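
The auto removal decision could be as small as the sketch below. It assumes the detector hands back a dict of language name to probability, which is an assumption about its output format; the 99% threshold and the Indonesian target come from the original proposal quoted further down, and `should_auto_delete` is a hypothetical name.

```python
# Values taken from the proposal: delete only pages judged Indonesian
# beyond a 99% confidence level.
AUTO_DELETE_LANGUAGE = "Indonesian"
CONFIDENCE_THRESHOLD = 0.99

def should_auto_delete(probabilities):
    """Decide whether a new page qualifies for automatic removal.

    `probabilities` is assumed to map language names to values in [0, 1].
    """
    return probabilities.get(AUTO_DELETE_LANGUAGE, 0.0) > CONFIDENCE_THRESHOLD
```

Keeping the threshold in one constant makes it easy to tighten or loosen after looking at a few days of detection results.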

I haven't yet implemented the page removal by reply. That'll come later.
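
For the removal-by-reply part, one possible shape is sketched here using Python's standard `email` module. The keyword is the one floated in the original proposal; `deletion_target` is a hypothetical helper, and mapping the returned Message-ID back to a wiki page is left out because it depends on how the notifications are stored.

```python
from email import message_from_string

# Keyword suggested in the original proposal.
DELETE_KEYWORD = "BURN IN HELL"

def deletion_target(raw_reply):
    """Return the Message-ID of the notification to act on, or None."""
    msg = message_from_string(raw_reply)
    parent = msg.get("In-Reply-To")
    if parent is None:
        return None  # not a reply, so there is nothing to delete
    body = msg.get_payload()
    if not isinstance(body, str):
        return None  # multipart message; this sketch handles plain text only
    # Skip quoted lines so a keyword inside quoted text does not trigger again.
    own_lines = [l for l in body.splitlines() if not l.lstrip().startswith(">")]
    if any(DELETE_KEYWORD in l for l in own_lines):
        return parent.strip()
    return None
```

This only looks at the message itself; it leans on the invitation-only list to decide who is allowed to send such replies.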


2015-03-02 12:59 GMT-08:00 Larry Shatzer, Jr. <larrys at gmail.com>:

> I like the idea of spreading the load around, and possibly automating it
> via email (or irc) to fight spam.
>
> -- Larry
>
> On Mon, Mar 2, 2015 at 1:40 PM, Kohsuke Kawaguchi <kk at kohsuke.org> wrote:
>
>> This is just an idea.
>>
>> I was thinking about how we can cope more effectively with Wiki spam, and
>> spread that workload.
>>
>> What if we establish a mailing list based workflow? We'll create a
>> mailing list that spam fighters will join, and this list receives the
>> notifications from Confluence about new pages.
>>
>> We'll have a bot monitor this list as well, and if it sees us replying to
>> a notification email with some keyword, say "BURN IN HELL", it'll go delete
>> that page. I think this simplifies the workflow for us humans quite a bit,
>> and it'll make it easier for multiple people to collaborate on this task.
>> The invitation-only ML would serve as a kind of authentication mechanism,
>> to prevent the bot from going nuts.
>>
>> The bot could evolve to do more actions, such as removing the user from
>> LDAP and perhaps feeding that information back to stopforumspam.
>>
>> I've also experimented with a language detection library, and it seems to
>> work well. So our bot could automatically delete any new page that is
>> judged to be Indonesian at a 99%+ confidence level, and it could
>> auto-reply to that list saying it deleted the page.
>>
>> The accumulated archive will serve as a nice record of action to analyze
>> later.
>>
>> Is something like this useful?
>>
>> --
>> Kohsuke Kawaguchi
>>
>
>


-- 
Kohsuke Kawaguchi