[Jenkins-infra] INFRA-240 is fixed / post-mortem
R. Tyler Croy
tyler at monkeypox.org
Mon Feb 16 19:02:29 UTC 2015
On Mon, 16 Feb 2015, Kohsuke Kawaguchi wrote:
> In the last 6 months or so, we've handed out infra access rights to a few
> more people (Daniel Beck and Oleg Nanoshev, IIRC), and that was good for
> better time zone coverage and what not. But the problem still remains that
> there is a leadership vacuum, that no one sufficiently "owns" the infra,
> and that's difficult to solve by adding more hands alone.
> So here's what I'd like to propose:
> - Formalize our ops team more by designating the lead that reports to
> the board. The lead shall be chosen in the discussion during the project
> - Under the new lead, accept another round of ops team members to help
> spread the workload. I know for example Kostasya is interested in helping.
> - Kohsuke (and Tyler if he can join) and the ops team will schedule a
> series of "transfer of information" sessions to bring the new ops lead and
> the team up to speed about how things are put together today.
> - Identify and remove single points of failure in our infra. Off the top
> of my head:
> - I think I'm currently the only one who has the private key to sign
> the update center root CA.
> - jenkins-ci.org domain name still appears to be registered under
> Tyler's personal account.
> As the ops lead, I'd like the project to consider Adam Papai
> <https://github.com/woohgit>. He's been a long time user of Jenkins and he
> is a member of the CloudBees ops team. I'm sensitive to the fact that he
> works for CloudBees and how that can come across, but OTOH this will be a
> part of his day job, and I think that ensures that he can allocate
> necessary time to the effort.
Since I've got a couple of real-world things consuming a boatload of my time, I
don't have any objections to Adam joining the infra team. I'm not sure I like
the term "ops lead" as I've never thought of there being a leadership structure
around our infrastructure so much as a steaming pile of JIRAs and not enough
people to tackle them :-P
I would suggest ramping Adam up in the following ways to mitigate some of our
* Documenting and migrating backend crawlers into the jenkins-infra GH
organization. This is one of the places where I think we have a seriously
low bus factor
* Helping KostySha where I have failed, with feedback on this PR:
* Drive migration of JIRA and Confluence onto the newer hardware and newer
versions we've not been able to complete due to time
There's a long tail of other smaller projects, but in terms of our current
infra health and its effect on the project's continued growth and success, I
think those are the areas of most need.
See you chaps in #jenkins-infra
-R. Tyler Croy