[Jenkins-infra] Update 'Confluence is choking'

Olblak me at olblak.com
Mon Mar 27 10:08:14 UTC 2017


I would like to modify the 'Confluence is choking' Datadog warning, as it
doesn't fully reflect the service's health status.
Before doing that, I want to be sure that nobody relies on it.

Currently this check uses the number of busy Apache workers and triggers a
warning if it is greater than 145.
That doesn't mean that something bad is happening, only that traffic
is increasing.

What can go wrong is reaching the maximum worker limit (currently 250).
Once it is reached, all new HTTP requests are queued until a worker is
available to process them.
Meanwhile, response time increases for the queued requests.
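
To illustrate the idea, here is a rough sketch of a check that compares busy
workers against the worker limit instead of a fixed count. It assumes the
count is read from Apache's mod_status machine-readable output (the
"BusyWorkers:" line); the function names and the 0.9 warning ratio are
hypothetical and only there for illustration, not an agreed threshold.

```python
def parse_busy_workers(status_text):
    """Extract the BusyWorkers count from mod_status '?auto' style output."""
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "BusyWorkers":
            return int(value.strip())
    raise ValueError("BusyWorkers not found in status output")


def worker_state(busy, limit=250, warn_ratio=0.9):
    """Classify worker usage relative to the worker limit.

    limit=250 is the figure quoted in this mail; warn_ratio is a
    hypothetical value chosen for the example.
    """
    if busy >= limit:
        return "critical"  # all workers busy, new requests are queued
    if busy >= limit * warn_ratio:
        return "warning"   # close to saturation
    return "ok"            # traffic may be high, but workers are available


# Sample input made up for the example, not real monitoring data.
sample = "BusyWorkers: 145\nIdleWorkers: 105\n"
print(worker_state(parse_busy_workers(sample)))
```

With these assumed values, 145 busy workers would no longer raise a warning,
while anything close to 250 would.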

It's hard to increase the worker limit at the moment, as more workers mean
more CPU/memory usage,
and the host is already running out of both.

I also suggest reducing the 'Confluence is slow' warning threshold from
3 sec to 0.1 sec in order to capture anomalies.

