[Jenkins-infra] Revisiting access control for anonymized "census data"

Vivek Pandey vivek.pandey at gmail.com
Wed Jun 8 05:52:19 UTC 2016


To put anonymized data in context, this is what I am referring to:
https://wiki.jenkins-ci.org/display/JENKINS/Usage+Statistics



> On Jun 7, 2016, at 6:32 PM, Andrew Bayer <andrew.bayer at gmail.com> wrote:
> 
> Fwiw, the census data is just a subset of the full anonymized data - all we remove are instance reports with no jobs and instances that only report (with jobs) once in a month. The data for each record is the same.
> 
> A.
> 
>> On Tuesday, June 7, 2016, Vivek Pandey <vivek.pandey at gmail.com> wrote:
>> The anonymized data looks pretty well anonymized, so I do not see anything in there that needs confidentiality as far as ACLs are concerned, and I am fine with the proposed licensing.
>> 
>>> On Tue, Jun 7, 2016 at 5:10 PM, R. Tyler Croy <tyler at monkeypox.org> wrote:
>>> (replies inline)
>>> 
>>> On Tue, 07 Jun 2016, Kohsuke Kawaguchi wrote:
>>> 
>>> > I meant INFRA-682 for "anonymized data" and not for "monthly data" that's
>>> > currently in the /census. See
>>> > https://wiki.jenkins-ci.org/display/JENKINS/Usage+Statistics for what those
>>> > two terms mean. The former contains much richer data than the latter. Do
>>> > you still feel the same way with the anonymized data?
>>> 
>>> 
>>> Well, the title refers to "Write down process to request & grant access to
>>> census.jenkins.io"
>>> 
>>> Only census.json.gz data is on census.jenkins.io.
>>> 
>>> We do not currently have, nor have we ever had, the infrastructure for serving
>>> up the raw anonymized access logs, but fundamentally I don't see access
>>> controls as necessary for that data either. I would rather we did not expand
>>> the scope of what's being discussed here, however.
>>> 
>>> > I agree that at this point access control on the monthly data seems largely
>>> > unneeded.
>>> >
>>> > Also, just for the record, one of the motivations for putting it behind the
>>> > access control wall is so that we can know who is looking at the data so
>>> > that we can encourage them to bring the results to the community.
>>> 
>>> 
>>> I vaguely recall that; unfortunately, I don't think our hopes were ever realized
>>> there :/
>>> 
>>> I hope/think the database licensing that I proposed would help encourage more
>>> open data mining and stats digging.
>>> 
>>> 
>>> - R. Tyler Croy
>>> 
>>> ------------------------------------------------------
>>>      Code: <https://github.com/rtyler>
>>>   Chatter: <https://twitter.com/agentdero>
>>> 
>>>   % gpg --keyserver keys.gnupg.net --recv-key 3F51E16F
>>> ------------------------------------------------------