[Jenkins-infra] Confluence instability post-mortem

Kohsuke Kawaguchi kk at kohsuke.org
Tue Mar 24 01:59:56 UTC 2015


As I was about to issue a security advisory, I noticed that Confluence
was acting up. It accepted inbound HTTP connections, but very slowly, and
even once a connection was accepted, it failed to render HTML in a timely
manner.

Now I think I know what's going on.

The way the system is put together is that there's Apache at the very
front, and requests for the wiki are forwarded to nginx, which acts as the
cache layer. On a cache miss, nginx forwards the request on to Tomcat,
which runs Confluence.
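
To make the request path concrete, here is a rough sketch in Python of a
cache tier that falls through to the backend on a miss. It is only a model,
not the actual nginx configuration: the names (make_cache_tier, confluence,
backend_load) and the example paths are made up for illustration, and it
assumes the static cache is pre-generated rather than filled on a miss.

```python
from typing import Callable, Dict, List


def make_cache_tier(static_cache: Dict[str, str],
                    backend: Callable[[str], str]) -> Callable[[str], str]:
    """Return a handler that serves from the static cache when it can
    and forwards to the backend on a miss (hypothetical model of the
    nginx tier; the cache is assumed to be pre-generated)."""
    def handle(path: str) -> str:
        page = static_cache.get(path)
        if page is not None:
            return page          # cache hit: backend is never touched
        return backend(path)     # cache miss: request lands on the backend
    return handle


# Records every request that reaches the (stand-in) Confluence backend.
backend_requests: List[str] = []


def confluence(path: str) -> str:
    """Stand-in for Tomcat/Confluence rendering a page."""
    backend_requests.append(path)
    return f"rendered:{path}"


def backend_load(static_cache: Dict[str, str], paths: List[str]) -> int:
    """Count how many of the given requests fall through to the backend."""
    backend_requests.clear()
    handler = make_cache_tier(static_cache, confluence)
    for p in paths:
        handler(p)
    return len(backend_requests)


# Illustrative caches: one fully populated, one only partially populated.
full_cache = {"/display/A": "cached:/display/A",
              "/display/B": "cached:/display/B"}
partial_cache = {"/display/A": "cached:/display/A"}
traffic = ["/display/A", "/display/B", "/display/B"]
```

With the fully populated cache none of the requests in traffic reach the
backend; with the partial one, every request for the missing page does.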

I think the root cause of the problem is that /srv/wiki/cache wasn't fully
populated. I discovered this at the very end, and I still don't know why
it was only partially populated, but this explains everything.

Normally, the cache tier responds to most requests. But with much of the
cache missing, Confluence takes far more load than usual. Unfortunately,
Tomcat was configured to spin up as many as 200 request handling threads,
yet it only had 15 DB connections in the pool. So almost all of the 200
request handling threads ended up competing for the available database
connections. This was quite visible in the thread dump.

I've doubled the DB connection pool size from 15 to 30 as per this KB
document
<https://confluence.atlassian.com/display/CONFKB/Confluence+Slows+and+Times+Out+During+Periods+of+High+Load+Due+to+DB+Connection+Pool>
(this had to be done outside Puppet, as the file contains passwords and so
cannot be managed in infra-puppet), and reduced the maximum number of
request handling threads from 200 to 75. This way, even if Confluence sees
increased load, it doesn't end up accepting more connections than it can
serve.
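
As a back-of-the-envelope check on those numbers, here is a small Python
sketch. It assumes the worst case, where every request handling thread is
busy and each one needs exactly one DB connection at the same time; that is
a simplification, and the function names are mine, not anything from Tomcat
or Confluence.

```python
def worst_case_waiting(max_threads: int, pool_size: int) -> int:
    """Threads left waiting for a DB connection if every request thread
    wants one at once (simplifying assumption: one connection each)."""
    return max(0, max_threads - pool_size)


def threads_per_connection(max_threads: int, pool_size: int) -> float:
    """How many request threads share each pooled connection."""
    return max_threads / pool_size


# Old settings: 200 threads, 15 connections.
before_waiting = worst_case_waiting(200, 15)
before_ratio = threads_per_connection(200, 15)

# New settings: 75 threads, 30 connections.
after_waiting = worst_case_waiting(75, 30)
after_ratio = threads_per_connection(75, 30)

print(f"before: {before_waiting} waiting, {before_ratio:.1f} threads/connection")
print(f"after:  {after_waiting} waiting, {after_ratio:.1f} threads/connection")
```

Under that assumption the worst case goes from 185 threads waiting on the
pool to 45, and from roughly 13 threads per connection to 2.5.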

I've also kicked off regeneration of the static cache. Confluence CPU usage
is down and the site is mostly snappy, and it'll get better as the static
cache fills up.


-- 
Kohsuke Kawaguchi