[Jenkins-infra] Revisiting access control for anonymized "census data"

Kohsuke Kawaguchi kk at kohsuke.org
Tue Jun 7 23:55:52 UTC 2016


I meant INFRA-682 to cover the "anonymized data", not the "monthly data" that
is currently under /census. See
https://wiki.jenkins-ci.org/display/JENKINS/Usage+Statistics for what those
two terms mean. The former contains much richer data than the latter. Do
you still feel the same way about the anonymized data?

I agree that at this point access control on the monthly data seems largely
unneeded.

Also, just for the record, one of the motivations for putting it behind the
access control wall is that we can know who is looking at the data and
encourage them to bring their results back to the community.


On Tue, Jun 7, 2016 at 11:23 AM R. Tyler Croy <tyler at monkeypox.org> wrote:

>
> For a long, long time "census data", anonymized usage collection information,
> was shared under jenkins-ci.org/census to those who asked. The path, /census/,
> was guarded by a simple HTTP Basic Auth prompt and a user/password combination
> that was so readily shared that you can find it in the mailing list archives.
>
>
> In the process of revamping this stats collection and delivery infrastructure,
> we have migrated to census.jenkins.io and I have the following ticket assigned
> to me: <https://issues.jenkins-ci.org/browse/INFRA-682>
>
>     "Write down process to request & grant access to census.jenkins.io"
>
>
> In thinking about this ticket, and discussing a bit more with abayer, I'm not
> convinced that "census access control" makes sense anymore. If memory serves,
> the reason we originally put a gate in front was that we were not confident
> in the anonymization of our data. This was ages ago, and I'm fairly confident
> /now/ in it :)
>
>
> The other concern behind access control was preventing bandwidth abuses, by
> clients downloading massive datasets over and over again. I'm not convinced
> this is an issue anymore with our current infrastructure and as we have shown
> with archives.jenkins-ci.org, we're more than capable of throttling download
> traffic.
>
>
> SO HERE'S MY STUPID PROPOSAL:
>
>  * License the dataset under the Open Database License 1.0
>     (http://opendatacommons.org/licenses/odbl/1.0/)
>  * Remove access controls to census data (but introduce some throttling to
>    prevent abuses)
>
>
>
> That's kind of it :)
>
>
>
>
> Cheers
> - R. Tyler Croy
>
> ------------------------------------------------------
>      Code: <https://github.com/rtyler>
>   Chatter: <https://twitter.com/agentdero>
>
>   % gpg --keyserver keys.gnupg.net --recv-key 3F51E16F
> ------------------------------------------------------