[Jenkins-infra] Update 'Confluence is choking'
me at olblak.com
Mon Mar 27 10:08:14 UTC 2017
I would like to modify the 'Confluence is choking' Datadog warning, as it
doesn't accurately reflect the service's health status.
Before doing that, I want to be sure that nobody relies on it.
Currently this check uses the number of busy Apache workers and triggers a
warning if it's greater than 145,
which doesn't mean that something bad is happening but only that traffic
What can go wrong is when we reach the top worker limit (250 with current
All new HTTP requests are queued until a worker is available to process
Meanwhile, response time will increase for queued requests.
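
A minimal sketch of a check that keys on worker saturation rather than on a fixed busy-worker count. It assumes the count is read from Apache's mod_status machine-readable output (the BusyWorkers field); the function names and the 90% warning ratio are hypothetical, and only the 250 limit comes from the figures above.

```python
def parse_mod_status(text):
    """Parse the 'Key: value' lines of mod_status machine-readable output."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a 'Key: value' shape
            fields[key.strip()] = value.strip()
    return fields


def worker_state(busy, limit=250, warn_ratio=0.9):
    """Classify worker usage against the worker limit.

    limit=250 is the limit quoted above; warn_ratio is an assumed value.
    """
    if busy >= limit:
        return "saturated"  # no free worker left: new requests are queued
    if busy >= warn_ratio * limit:
        return "warning"    # close to the limit, queueing is likely soon
    return "ok"             # busy, but requests are still served directly


# Hypothetical sample: 145 busy workers is just traffic, not saturation.
sample = "BusyWorkers: 145\nIdleWorkers: 105\n"
fields = parse_mod_status(sample)
print(worker_state(int(fields["BusyWorkers"])))
```

With this shape, 145 busy workers no longer raises a warning on its own; the alert only fires as the pool approaches the point where requests start queueing.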
It's hard to increase the worker limit at the moment, as more workers mean
more CPU/memory usage,
and the host is already running out of both.
I also suggest reducing the 'Confluence is slow' warning threshold from 3 sec to
0.1 sec in order to capture anomalies