[Jenkins-infra] Fighting with wiki spam

Larry Shatzer, Jr. larrys at gmail.com
Sun Mar 8 18:31:39 UTC 2015


I wonder if you can have the bot also do two other things when it deletes
the page. First, purge the trash for that page (not all the trash for the
space, just the page), since that also deletes any attachments on the page.
That is another step I've been doing when manually cleaning up spam. Second,
if possible, invalidate the user's session. This works best if their login is
deleted at the same time, since it slows them down: they have to either log
back in or try to create a new account. I've seen accounts that I've deleted
still create pages until the sync with LDAP happens and actually removes the
account from Confluence.
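
A rough sketch of that cleanup order, in Python. The WikiAdmin interface
and all four method names are hypothetical placeholders, not a real
Confluence or LDAP client API; the bot would need its own implementation
behind them.

```python
from typing import List, Protocol


class WikiAdmin(Protocol):
    """Hypothetical admin interface (assumption, not a real Confluence API)."""

    def delete_page(self, page_id: str) -> None: ...
    def purge_page_from_trash(self, page_id: str) -> None: ...
    def delete_user(self, username: str) -> None: ...
    def invalidate_sessions(self, username: str) -> None: ...


def clean_up_spam(wiki: WikiAdmin, page_id: str, username: str) -> List[str]:
    """Run the cleanup steps in order and return the names of the steps run."""
    steps: List[str] = []

    wiki.delete_page(page_id)
    steps.append("delete_page")

    # Purge only this page from the trash (not the whole space),
    # which also removes any attachments on the page.
    wiki.purge_page_from_trash(page_id)
    steps.append("purge_page_from_trash")

    # Delete the login, then drop any live session so the account cannot
    # keep creating pages while waiting for the LDAP sync.
    wiki.delete_user(username)
    steps.append("delete_user")
    wiki.invalidate_sessions(username)
    steps.append("invalidate_sessions")

    return steps
```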

On Fri, Mar 6, 2015 at 11:14 PM, Kohsuke Kawaguchi <kk at kohsuke.org> wrote:

> I started playing with this idea.
>
> I set up a mailing list
> <https://groups.google.com/forum/#!forum/jenkinsci-spambot>, fed wiki
> notifications into it, and got a bot running. Right now, the bot tries to
> determine whether the new page is in Japanese, English, or Indonesian,
> and just replies with that info to the list.
>
> I'm going to keep it like that for a few days to make sure it's detecting
> accurately, then I can implement the auto page removal.
>
> I haven't yet implemented the page removal by reply. That'll come later.
>
>
> 2015-03-02 12:59 GMT-08:00 Larry Shatzer, Jr. <larrys at gmail.com>:
>
>> I like the idea of spreading the load around, and possibly automating it
>> via email (or irc) to fight spam.
>>
>> -- Larry
>>
>> On Mon, Mar 2, 2015 at 1:40 PM, Kohsuke Kawaguchi <kk at kohsuke.org> wrote:
>>
>>> This is just an idea.
>>>
>>> I was thinking about how we can cope more effectively with Wiki spam,
>>> and spread that workload.
>>>
>>> What if we establish a mailing list based workflow? We'll create a
>>> mailing list that spam fighters will join, and this list receives the
>>> notifications from Confluence about new pages.
>>>
>>> We'll have a bot monitor this list as well, and if it sees us replying
>>> to a notification email with some keyword, say "BURN IN HELL", it'll go
>>> delete that page. I think this simplifies the workflow for us humans quite
>>> a bit, and it'll make it easier for multiple people to collaborate on this
>>> task. The invitation-only ML would serve as a kind of authentication
>>> mechanism, to prevent the bot from going nuts.
>>>
>>> The bot could evolve to do more actions, such as removing the user from
>>> LDAP and perhaps feeding that information back to stopforumspam.
>>>
>>> I've also experimented with a language detection library, and it seems
>>> to work well. So our bot could automatically delete any new page that is
>>> judged to be Indonesian at a 99%+ confidence level, and it could
>>> auto-reply to that list saying it deleted the page.
>>>
>>> The accumulated archive will serve as a nice record of action to analyze
>>> later.
>>>
>>> Is something like this useful?
>>>
>>> --
>>> Kohsuke Kawaguchi
>>>
>>
>>
>
>
> --
> Kohsuke Kawaguchi
>
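
The two triggers discussed in the quoted thread (a keyword reply on the
list, and a high-confidence language match) could be sketched roughly as
below. This is only an illustration: the Detection type, both function
names, the "id" language code, and the explicit membership check are
assumptions, and no particular language detection library is implied. Only
the keyword and the 99% figure come from the thread.

```python
from dataclasses import dataclass
from typing import Set

DELETE_KEYWORD = "BURN IN HELL"   # keyword proposed in the thread
AUTO_DELETE_LANGUAGE = "id"       # assumed ISO 639-1 code for Indonesian
AUTO_DELETE_CONFIDENCE = 0.99     # "99%+" read here as strictly above 0.99


@dataclass
class Detection:
    """Hypothetical result from whatever language detector the bot uses."""
    language: str
    confidence: float


def should_auto_delete(detection: Detection) -> bool:
    """True when the page is judged Indonesian above the threshold."""
    return (
        detection.language == AUTO_DELETE_LANGUAGE
        and detection.confidence > AUTO_DELETE_CONFIDENCE
    )


def is_delete_command(sender: str, body: str, members: Set[str]) -> bool:
    """True when a list member typed the keyword in a reply.

    `members` is expected to hold lowercase addresses. Quoted lines
    (starting with ">") are skipped so that the keyword only counts when
    the replier wrote it, not when it appears in the quoted notification.
    """
    if sender.lower() not in members:
        return False
    typed = [
        line for line in body.splitlines()
        if not line.lstrip().startswith(">")
    ]
    return any(DELETE_KEYWORD in line.upper() for line in typed)
```

Checking the sender against a member set is an extra guard on top of the
invitation-only list; whether it is needed depends on how much the list's
own access control is trusted.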