[Jenkins-infra] Fighting with wiki spam

Larry Shatzer, Jr. larrys at gmail.com
Sun Mar 8 23:19:31 UTC 2015


Since I just deleted a spam comment, scanning comment additions might be a
good idea too.

On Sun, Mar 8, 2015 at 1:43 PM, Kohsuke Kawaguchi <kk at kohsuke.org> wrote:

> It all makes sense. The main challenge is whether Confluence exposes those
> actions via API.
>
> (I started auto-deleting pages that are determined to be Indonesian with a
> 99% confidence level.)
>
> 2015-03-08 11:31 GMT-07:00 Larry Shatzer, Jr. <larrys at gmail.com>:
>
>> I wonder if you can have the bot also do two other things when it deletes
>> the page. First, purge the trash for that page (not all the trash for the
>> space, just the page), since that also deletes any attachments on the
>> page. That has been another step I've been doing when manually cleaning up
>> spam. Second, if possible, invalidate their session. This works great if
>> their login is also deleted at the same time, since it slows them down:
>> they have to either log back in or try to create a new account. I've seen
>> accounts that I've deleted still create pages until the sync with LDAP
>> happens and really removes their account from Confluence.
>>
>> On Fri, Mar 6, 2015 at 11:14 PM, Kohsuke Kawaguchi <kk at kohsuke.org>
>> wrote:
>>
>>> I started playing with this idea.
>>>
>>> I set up a mailing list
>>> <https://groups.google.com/forum/#!forum/jenkinsci-spambot>, fed wiki
>>> notifications into it, and got a bot running. Right now, the bot tries to
>>> determine whether the new page addition is in Japanese, English, or
>>> Indonesian, and just replies with that info back to the list.
>>>
>>> I'm going to keep it like that for a few days to make sure it's
>>> detecting accurately, then I can implement the auto page removal.
>>>
>>> I haven't yet implemented the page removal by reply. That'll come later.
>>>
>>>
>>> 2015-03-02 12:59 GMT-08:00 Larry Shatzer, Jr. <larrys at gmail.com>:
>>>
>>>> I like the idea of spreading the load around, and possibly automating it
>>>> via email (or irc) to fight spam.
>>>>
>>>> -- Larry
>>>>
>>>> On Mon, Mar 2, 2015 at 1:40 PM, Kohsuke Kawaguchi <kk at kohsuke.org>
>>>> wrote:
>>>>
>>>>> This is just an idea.
>>>>>
>>>>> I was thinking about how we can cope more effectively with Wiki spam,
>>>>> and spread that workload.
>>>>>
>>>>> What if we establish a mailing list based workflow? We'll create a
>>>>> mailing list that spam fighters will join, and this list receives the
>>>>> notifications from Confluence about new pages.
>>>>>
>>>>> We'll have a bot monitor this list as well, and if it sees us replying
>>>>> to a notification email with some keyword, say "BURN IN HELL", it'll go
>>>>> delete that page. I think this simplifies the workflow for us humans quite
>>>>> a bit, and it'll make it easier for multiple people to collaborate on this
>>>>> task. The invitation-only ML would serve as a kind of authentication
>>>>> mechanism, to prevent the bot from going nuts.
>>>>>
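The keyword-reply check described above could be sketched roughly as follows. This is only an illustration, not the bot's actual code: it assumes plain-text, single-part replies, and the function name `is_delete_command` and the `allowed_senders` set are hypothetical (the thread relies on invitation-only list membership; the sender check here is a stand-in for that). Quoted lines are skipped so that a keyword appearing only in quoted text does not trigger a delete.

```python
from email import message_from_string
from email.utils import parseaddr

# Keyword suggested in the thread; anything distinctive would do.
DELETE_KEYWORD = "BURN IN HELL"


def is_delete_command(raw_email: str, allowed_senders: set) -> bool:
    """Return True if a reply asks the bot to delete the notified page.

    Hypothetical helper: assumes a plain-text, single-part message.
    """
    msg = message_from_string(raw_email)

    # Stand-in for "only list members can post" (an assumption of this sketch).
    sender = parseaddr(msg.get("From", ""))[1].lower()
    if sender not in allowed_senders:
        return False

    # Multipart messages are ignored here to keep the sketch short.
    if msg.is_multipart():
        return False

    for line in msg.get_payload().splitlines():
        stripped = line.strip()
        # Skip quoted text so only the replier's own words count.
        if stripped.startswith(">"):
            continue
        if DELETE_KEYWORD in stripped:
            return True
    return False
```

A real bot would also need to map the reply back to the page named in the original notification, which this sketch leaves out.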
>>>>> The bot could evolve to do more actions, such as removing the user
>>>>> from LDAP and perhaps feeding that information back to stopforumspam.
>>>>>
>>>>> I've also experimented with a language detection library, and it seems
>>>>> to work well. So our bot could automatically delete any new page that is
>>>>> judged Indonesian at a 99%+ confidence level, and it could auto-reply to
>>>>> that list saying it deleted the page.
>>>>>
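The auto-delete rule above amounts to a threshold check on the detector's output. A minimal sketch, assuming the language detection library returns a list of (language code, probability) pairs; the names `should_auto_delete`, `AUTO_DELETE_LANG`, and `THRESHOLD` are made up for illustration, and treating exactly 0.99 as passing is also an assumption:

```python
from typing import List, Tuple

AUTO_DELETE_LANG = "id"  # ISO 639-1 code for Indonesian
THRESHOLD = 0.99         # "99%+ confidence" from the thread


def should_auto_delete(scores: List[Tuple[str, float]]) -> bool:
    """Decide whether a page should be deleted without human review.

    `scores` is the (hypothetical) detector output: language code and
    probability pairs for one page.
    """
    if not scores:
        return False
    # Only the most probable language is considered.
    lang, prob = max(scores, key=lambda pair: pair[1])
    return lang == AUTO_DELETE_LANG and prob >= THRESHOLD
```

Pages that fall below the threshold would stay on the list for a human to judge.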
>>>>> The accumulated archive will serve as a nice record of action to
>>>>> analyze later.
>>>>>
>>>>> Is something like this useful?
>>>>>
>>>>> --
>>>>> Kohsuke Kawaguchi
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Kohsuke Kawaguchi
>>>
>>
>>
>
>
> --
> Kohsuke Kawaguchi
>