[Jenkins-infra] Revisiting access control for anonymized "census data"

Andrew Bayer andrew.bayer at gmail.com
Wed Jun 8 01:32:13 UTC 2016


Fwiw, the census data is just a subset of the full anonymized data - all we
remove are instance reports with no jobs and instances that only report
(with jobs) once in a month. The data for each record is the same.
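
As a rough illustration of that filter, here is a minimal Python sketch. It
is not the actual census generation code; the field names "install" and
"jobs", and the reading of "no jobs" as a zero total job count, are
assumptions made only for this example.

```python
from collections import Counter


def census_subset(reports):
    """Filter one month of anonymized reports down to the census subset.

    `reports` is a list of dicts with hypothetical keys:
      "install": anonymized instance id
      "jobs":    mapping of job type -> count
    """
    # Step 1: drop reports that list no jobs at all.
    with_jobs = [r for r in reports if sum(r.get("jobs", {}).values()) > 0]
    # Step 2: count how often each instance reported (with jobs) this month.
    seen = Counter(r["install"] for r in with_jobs)
    # Step 3: keep only instances that reported more than once.
    return [r for r in with_jobs if seen[r["install"]] > 1]


month = [
    {"install": "a", "jobs": {"freestyle": 3}},
    {"install": "a", "jobs": {"freestyle": 4}},
    {"install": "b", "jobs": {"freestyle": 1}},  # reports only once
    {"install": "c", "jobs": {}},                # no jobs
    {"install": "c", "jobs": {}},
]
print([r["install"] for r in census_subset(month)])  # prints ['a', 'a']
```

Each surviving record is passed through unchanged, which matches the point
that the data for each record is the same in both sets.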

A.

On Tuesday, June 7, 2016, Vivek Pandey <vivek.pandey at gmail.com> wrote:

> The anonymized data looks pretty well anonymized, so I do not see anything
> in there that needs any confidentiality as far as ACLs are concerned, and I
> am fine with the proposed licensing.
>
> On Tue, Jun 7, 2016 at 5:10 PM, R. Tyler Croy <tyler at monkeypox.org> wrote:
>
>> (replies inline)
>>
>> On Tue, 07 Jun 2016, Kohsuke Kawaguchi wrote:
>>
>> > I meant INFRA-682 for "anonymized data" and not for "monthly data" that's
>> > currently in the /census. See
>> > https://wiki.jenkins-ci.org/display/JENKINS/Usage+Statistics for what those
>> > two terms mean. The former contains much richer data than the latter. Do
>> > you still feel the same way with the anonymized data?
>>
>>
>> Well, the title refers to "Write down process to request & grant access to
>> census.jenkins.io"
>>
>> Only census.json.gz data is on census.jenkins.io.
>>
>> We do not currently have, nor have we ever had, the infrastructure for
>> serving up the raw anonymized access logs, but fundamentally I don't see
>> access controls as necessary for that data either. I would rather we did
>> not expand the scope of what's being discussed here, however.
>>
>> > I agree that at this point access control on the monthly data seems largely
>> > unneeded.
>> >
>> > Also, just for the record, one of the motivations for putting it behind the
>> > access control wall is so that we can know who is looking at the data so
>> > that we can encourage them to bring the results to the community.
>>
>>
>> I vaguely recall that; unfortunately, I don't think our hopes were ever
>> realized there :/
>>
>> I hope/think the database licensing that I proposed would help encourage
>> more open data mining and stats digging.
>>
>>
>> - R. Tyler Croy
>>
>> ------------------------------------------------------
>>      Code: <https://github.com/rtyler>
>>   Chatter: <https://twitter.com/agentdero>
>>
>>   % gpg --keyserver keys.gnupg.net --recv-key 3F51E16F
>> ------------------------------------------------------
>>
>> _______________________________________________
>> Jenkins-infra mailing list
>> Jenkins-infra at lists.jenkins-ci.org
>> http://lists.jenkins-ci.org/mailman/listinfo/jenkins-infra
>>
>>
>