[Jenkins-infra] Confluence, Nginx and 99 reasons Docker hates me: a report.

Larry Shatzer, Jr. larrys at gmail.com
Mon Jan 18 17:43:03 UTC 2016


We need to get the spam bot running on a box more people have access to so we
can kick it when it does this. It is becoming more critical to keep spam away.
I do not want to spend my whole day off deleting spam.

On Mon, Jan 18, 2016 at 8:44 AM, R. Tyler Croy <tyler at monkeypox.org> wrote:

> (replies inline)
>
> On Mon, 18 Jan 2016, Larry Shatzer, Jr. wrote:
>
> > And now it appears the wiki spam bot is dead, since there are probably
> > around a thousand spam pages, all ones that would normally be killed by
> > it.
>
>
> Due to access constraints, I didn't touch the backend spam bot at all, for
> what it's worth. It may have cratered due to the connectivity issues, though.
> I'll ping KK about it as soon as I see him online.
>
>
> > On Sun, Jan 17, 2016 at 8:43 PM, R. Tyler Croy <tyler at monkeypox.org> wrote:
> >
> > >
> > > We had some wiki availability issues today that were partially my fault
> > > and partially related to trying to bring "build13" of the docker
> > > confluence image into production.
> > >
> > > KK made the change earlier last week to disable LDAP caching, but for some
> > > reason Docker wasn't pulling the new container properly. This is what I
> > > set out to fix about 6 hours ago.
> > >
> > >
> > > First, I discovered that newer versions of Docker had no problem pulling
> > > the docker container, and that we did not have consistent versions of
> > > Docker installed across our machines (1.5.0, 1.7.0 and 1.9.1 by my
> > > survey). With this commit[1] I ensured that we would have 1.9.1
> > > consistently installed. This required some changes to the forked version
> > > of the garethr-docker puppet module we use, since it's been changed quite
> > > a bit to accommodate newer options in later Docker versions.
> > >
> > > COOL, surely that must have been the end of my day.
> > >
> > >
> > > Second, after rolling out the Docker changes the wiki became unavailable.
> > > Investigation led to two problems. The first I have seen a few times
> > > already with Docker in our infrastructure: stale iptables routing rules.
> > > When Docker sets up its networking it installs some rules into a couple of
> > > chains in the `filter` and `nat` tables; periodically it has failed to
> > > clean up these rules, leading to requests not being routed between the
> > > confluence-cache and confluence containers. The second problem was an
> > > internal IP address, hard-coded for the confluence-cache container, that
> > > no longer existed, so naturally it wasn't finding the right confluence
> > > container. I addressed *that* with this[2] change.
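
For readers following along (this is not from the original thread, and not
necessarily what commit [2] does): one way to avoid a stale hard-coded address
is to look the container's IP up at deploy time. `docker inspect <name>` prints
a JSON array whose first element carries `NetworkSettings.IPAddress` for the
default bridge network. The sketch below only parses that output; the sample
JSON and the address in it are made up for illustration.

```python
import json


def container_ip(inspect_json: str) -> str:
    """Return the bridge-network IP found in `docker inspect <name>` output."""
    data = json.loads(inspect_json)
    if not data:
        raise ValueError("docker inspect returned no containers")
    ip = data[0].get("NetworkSettings", {}).get("IPAddress", "")
    if not ip:
        raise ValueError("container has no IP address (is it running?)")
    return ip


# Hypothetical, heavily trimmed sample of `docker inspect confluence` output.
SAMPLE = '[{"Name": "/confluence", "NetworkSettings": {"IPAddress": "172.17.0.5"}}]'

if __name__ == "__main__":
    print(container_ip(SAMPLE))
```

In practice the JSON would come from running `docker inspect confluence` on the
host; the parsing is kept separate here so it can be checked without Docker.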
> > >
> > >
> > > While debugging this, I noticed another cute behavior of docker with its
> > > named containers support. Since we name our containers (e.g.
> > > `confluence`), the docker daemon will actually persist the tag and some of
> > > the options passed into the `docker run` invocation. I.e.
> > > `docker run -e SOME=foo --name bleepbloop rtyler/myimage` would persist the
> > > environment variable options (SOME=foo) until I stopped and removed the
> > > container (e.g. `docker rm bleepbloop`).
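
As an illustration of the behavior described above (a sketch added for readers,
not taken from the thread or from the jenkins-infra puppet code): because a
named container keeps the options it was created with, changing an option means
stopping and removing the old container before running a new one under the same
name. The helper name `recreate_commands` and the `-d` flag are assumptions; the
container and image names are the ones from the email's own example.

```python
import shlex


def recreate_commands(name, image, env=None):
    """Build the docker commands that replace a named container from scratch."""
    env = env or {}
    run = ["docker", "run", "-d", "--name", name]
    # Sort keys so the generated command line is stable between runs.
    for key, value in sorted(env.items()):
        run += ["-e", f"{key}={value}"]
    run.append(image)
    return [
        ["docker", "stop", name],  # stop the old container
        ["docker", "rm", name],    # drop it, along with its persisted options
        run,                       # start fresh with the new options
    ]


if __name__ == "__main__":
    for cmd in recreate_commands("bleepbloop", "rtyler/myimage", {"SOME": "bar"}):
        print(" ".join(shlex.quote(part) for part in cmd))
```

The function only builds the command lists; actually running them (for example
with `subprocess.run`) is left to the caller.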
> > >
> > > To remedy this, I nuked all the previous incantations of named containers
> > > from the host running confluence. That finished, I could FINALLY run
> > > `build13` of the confluence container, which had the LDAP cache setting
> > > change that KK made earlier. Bringing that up, I discovered another
> > > issue...
> > >
> > >
> > > Third, lots of spammers and bots are regularly hitting the wiki, which I
> > > suspected was keeping confluence from coming online and staying online, so
> > > I made this commit[3] to deny those bots at the Apache proxy level
> > > (refresher, requests go: Apache (ssl termination) -> Nginx (cache) ->
> > > Confluence).
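
To make the idea of denying bots at the front proxy concrete, here is a minimal
sketch added for readers. It assumes the filtering is done on the User-Agent
header, which may not be what commit [3] actually does, and the patterns are
invented for illustration. The point is only that the check runs before a
request ever reaches Nginx or Confluence.

```python
import re

# Hypothetical patterns for illustration only; not taken from commit [3].
DENIED_AGENTS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"examplebot", r"spam-crawler")
]


def is_denied(user_agent: str) -> bool:
    """Return True if the User-Agent matches any denied pattern."""
    return any(p.search(user_agent or "") for p in DENIED_AGENTS)


if __name__ == "__main__":
    for agent in ("Mozilla/5.0 (compatible; ExampleBot/1.0)", "Mozilla/5.0 Firefox/43.0"):
        print(agent, "->", "deny" if is_denied(agent) else "allow")
```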
> > >
> > >
> > > All that said and done, it still does not appear that the current
> > > configuration of Confluence can sustain the traffic levels without LDAP
> > > caching enabled, so I unfortunately have pinned things back down to
> > > `build7`.
> > >
> > >
> > >
> > > You may be asking yourself at this point of the email: "why is he writing
> > > all this out?" Welp, this is effectively what I spent my Sunday doing, and
> > > it would be a shame if nobody but me learned from this colossal waste of
> > > time. :)
> > >
> > >
> > > Anywho, that's that. Confluence is back online, and I'm probably not going
> > > to touch it for at least a few days, lest I go crazy.
> > >
> > >
> > > [1] https://github.com/jenkins-infra/jenkins-infra/commit/0107e79b0aa7b5bd9acd3d4d6b268c4178331beb
> > > [2] https://github.com/jenkins-infra/jenkins-infra/commit/f95c0e67803e9129c54a3f7fe8fce2940f7ad874
> > > [3] https://github.com/jenkins-infra/jenkins-infra/commit/675e4bdfc7bdd96b34046dc872f73f7f514e4e49
> > >
> > >
> > > Cheers
> > > - R. Tyler Croy
> > >
> > > ------------------------------------------------------
> > >      Code: <https://github.com/rtyler>
> > >   Chatter: <https://twitter.com/agentdero>
> > >
> > >   % gpg --keyserver keys.gnupg.net --recv-key 3F51E16F
> > > ------------------------------------------------------
>
> - R. Tyler Croy