[Jenkins-infra] Revisiting access control for anonymized "census data"

Baptiste Mathus bmathus at gmail.com
Wed Jun 8 15:32:09 UTC 2016


+1. I played with it some months ago and agree that, IMO, there is no
privacy issue with the data that was already available.
On 8 June 2016 at 5:14 PM, "Kohsuke Kawaguchi" <kk at kohsuke.org> wrote:

> Given all the +1s from people who've actually looked & played with the
> data, I feel much better.
>
> On Tue, Jun 7, 2016 at 10:52 PM Vivek Pandey <vivek.pandey at gmail.com>
> wrote:
>
>> To put anonymized data in context, this is what I am referring to:
>> https://wiki.jenkins-ci.org/display/JENKINS/Usage+Statistics
>>
>>
>> On Jun 7, 2016, at 6:32 PM, Andrew Bayer <andrew.bayer at gmail.com> wrote:
>>
>> Fwiw, the census data is just a subset of the full anonymized data - all
>> we remove are instance reports with no jobs and instances that only report
>> (with jobs) once in a month. The data for each record is the same.
>>
>> A.
>>
>> On Tuesday, June 7, 2016, Vivek Pandey <vivek.pandey at gmail.com> wrote:
>>
>>> The anonymized data looks pretty well anonymized, so I do not see anything
>>> in there that needs any confidentiality as far as ACL is concerned, and I
>>> am fine with the proposed licensing.
>>>
>>> On Tue, Jun 7, 2016 at 5:10 PM, R. Tyler Croy <tyler at monkeypox.org>
>>> wrote:
>>>
>>>> (replies inline)
>>>>
>>>> On Tue, 07 Jun 2016, Kohsuke Kawaguchi wrote:
>>>>
>>>> > I meant INFRA-682 for "anonymized data" and not for "monthly data" that's
>>>> > currently in the /census. See
>>>> > https://wiki.jenkins-ci.org/display/JENKINS/Usage+Statistics for what those
>>>> > two terms mean. The former contains much richer data than the latter. Do
>>>> > you still feel the same way with the anonymized data?
>>>>
>>>>
>>>> Well, the title refers to "Write down process to request & grant access to
>>>> census.jenkins.io"
>>>>
>>>> Only census.json.gz data is on census.jenkins.io.
>>>>
>>>> We do not currently have, nor have we ever had, the infrastructure for
>>>> serving up the raw anonymized access logs, but fundamentally I don't see
>>>> access controls as necessary for that data either. I would rather we did
>>>> not expand the scope of what's being discussed here, however.
>>>>
>>>> > I agree that at this point access control on the monthly data seems largely
>>>> > unneeded.
>>>> >
>>>> > Also, just for the record, one of the motivations for putting it behind the
>>>> > access control wall is so that we can know who is looking at the data so
>>>> > that we can encourage them to bring the results to the community.
>>>>
>>>>
>>>> I vaguely recall that; unfortunately, I don't think our hopes were ever
>>>> realized there :/
>>>>
>>>> I hope/think the database licensing that I proposed would help encourage
>>>> more open data mining and stats digging.
>>>>
>>>>
>>>> - R. Tyler Croy
>>>>
>>>> ------------------------------------------------------
>>>>      Code: <https://github.com/rtyler>
>>>>   Chatter: <https://twitter.com/agentdero>
>>>>
>>>>   % gpg --keyserver keys.gnupg.net --recv-key 3F51E16F
>>>> ------------------------------------------------------
>>>>
>>>> _______________________________________________
>>>> Jenkins-infra mailing list
>>>> Jenkins-infra at lists.jenkins-ci.org
>>>> http://lists.jenkins-ci.org/mailman/listinfo/jenkins-infra
>>>>
>>>>