[Jenkins-infra] Cucumber bandwidth situation

Kohsuke Kawaguchi kkawaguchi at cloudbees.com
Mon Jun 2 23:52:42 UTC 2014


More progress.

When the last overage happened we started using mod_logio to record 
total bytes in/out per request [1]. So I looked at which virtual host is 
eating bandwidth, and the result was quite surprising.

Looking at the # of bytes transferred outbound per virtual host per 
week, I get this.

> ci      489,290,628,016
> main     12,203,528,309
> maven     7,378,839,415
> pkg       8,105,888,873
> svn         937,561,148
> updates  17,900,766,883

So ci.jenkins-ci.org has served 490GB/week (!) of data. That's clearly 
too much.
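
For reference, a tally like the one above could be produced along these lines. This is only a minimal sketch: the actual LogFormat used on cucumber is not shown in this message, so it assumes a hypothetical layout in which the first field of each log line is the virtual host name and the last field is the mod_logio %O (bytes sent) value. The function name and the sample lines are made up for illustration.

```python
from collections import defaultdict


def bytes_out_per_vhost(lines):
    """Sum outbound bytes per virtual host.

    Assumption (not from the real config): the first whitespace-separated
    field is the virtual host and the last field is mod_logio's %O value.
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or not fields[-1].isdigit():
            continue  # skip blank or malformed lines
        totals[fields[0]] += int(fields[-1])
    return dict(totals)


if __name__ == "__main__":
    # Made-up sample lines, only to show the shape of the input.
    sample = [
        'ci 192.0.2.1 "GET /a HTTP/1.1" 200 512 1000',
        'ci 192.0.2.2 "GET /b HTTP/1.1" 200 400 2500',
        'updates 192.0.2.3 "GET /c HTTP/1.1" 200 300 700',
    ]
    for vhost, total in sorted(bytes_out_per_vhost(sample).items()):
        print(f"{vhost:<8} {total:>15,}")
```

In practice the lines would come from the week's access logs rather than an in-memory list.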

So I looked at its access log and noticed that 64.125.71.142 is 
downloading 
/job/jenkins_rc_branch/lastSuccessfulBuild/artifact/war/target/jenkins.war 
over and over.

This file is about 68MB in size, and a download happens every minute. 
That's about 95GB/day, or 2.8TB/month if extrapolated.
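
The arithmetic behind that estimate can be checked with a few lines. This sketch takes 1 GB as 1024 MB and a month as 30 days, both of which are assumptions made here to reproduce the rounded figures; the constant and function names are illustrative.

```python
FILE_SIZE_MB = 68             # approximate size of jenkins.war, from above
DOWNLOADS_PER_DAY = 24 * 60   # one download per minute


def daily_transfer_gb(file_size_mb, downloads_per_day):
    """Daily transfer in GB, taking 1 GB = 1024 MB."""
    return file_size_mb * downloads_per_day / 1024


per_day = daily_transfer_gb(FILE_SIZE_MB, DOWNLOADS_PER_DAY)
per_month_tb = per_day * 30 / 1024  # 30-day month, 1 TB = 1024 GB
print(f"{per_day:.1f} GB/day, {per_month_tb:.1f} TB/month")
# prints "95.6 GB/day, 2.8 TB/month"
```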

THIS GUY SINGLEHANDEDLY CONSUMED MORE THAN HALF THE BANDWIDTH!



The actual download counts per week are as follows:

     Apr,03  10165
     Apr,10  10171
     Apr,17  7426
     Apr,24  10180
     May,1   11203
     May,8   10226
     May,15  8975
     May,22  10140

There are occasional gaps (perhaps ci.jenkins-ci.org was down, or 
there's some pause in what he does), but still clearly this amounts to a 
ridiculous amount of network transfer.
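
Weekly counts like these could be pulled out of the access log roughly as follows. This is a sketch that assumes the log lines begin in Common Log Format (client IP, identity, user, bracketed timestamp, quoted request); the real format on ci.jenkins-ci.org may differ, and the function name and 7-day bucketing are choices made here for illustration.

```python
import re
from collections import Counter
from datetime import date, datetime, timedelta

# Matches the leading part of a Common Log Format line:
#   client-ip ident user [timestamp] "METHOD path ...
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+)')


def weekly_downloads(lines, ip, path, start):
    """Count requests from `ip` for `path` in 7-day buckets beginning at `start`."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group(1) != ip or m.group(3) != path:
            continue
        when = datetime.strptime(m.group(2), "%d/%b/%Y:%H:%M:%S %z")
        bucket = (when.date() - start).days // 7
        if bucket >= 0:  # ignore requests before the first bucket
            counts[start + timedelta(weeks=bucket)] += 1
    return dict(counts)


if __name__ == "__main__":
    war = ("/job/jenkins_rc_branch/lastSuccessfulBuild/"
           "artifact/war/target/jenkins.war")
    # Made-up log line, only to show the expected input shape.
    sample = [
        f'64.125.71.142 - - [03/Apr/2014:00:01:00 +0000] "GET {war} HTTP/1.1" 200 1',
    ]
    print(weekly_downloads(sample, "64.125.71.142", war, date(2014, 4, 3)))
```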

I'm going to ban this IP in the Apache config to stop this immediately.


Going forward, we need a better approach to this than waiting for the 
overage to slap our face before noticing the problem. For that I think 
we need to have some monitoring in place to alert us in case of sudden 
traffic increase.
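
As a starting point for discussion, a check for a sudden increase could be as simple as comparing each day's outbound traffic against the average of the preceding days. This is only a sketch: the 7-day window and the 1.25 threshold are arbitrary values picked for illustration, and how the daily figures are collected (for example from vnstat) is left out.

```python
from statistics import mean


def traffic_alerts(daily_gib, window=7, threshold=1.25):
    """Return the indexes of days whose traffic exceeds `threshold` times
    the mean of the preceding `window` days."""
    alerts = []
    for i in range(window, len(daily_gib)):
        baseline = mean(daily_gib[i - window:i])
        if daily_gib[i] > threshold * baseline:
            alerts.append(i)
    return alerts


if __name__ == "__main__":
    # Made-up daily outbound figures in GiB: a flat week, then one spike.
    history = [100] * 7 + [100, 130, 100]
    print(traffic_alerts(history))  # prints [8]
```

A trailing-average baseline like this would not catch a slow, steady climb, so a fixed monthly budget check may be needed alongside it.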

Also, as you can see in the log above, this didn't start in May, so it 
doesn't explain the sudden increase in traffic in May. So we still have 
more analysis to do.





[1] http://httpd.apache.org/docs/2.2/mod/mod_logio.html

On 06/02/2014 03:39 PM, Kohsuke Kawaguchi wrote:
>
> I looked at the output from haproxy
> (http://kohsuke.org/private/20140602/haproxy-stats.png), and it shows a
> large amount of activity under "maven", which is
> http://maven.jenkins-ci.org/, a reverse proxy to
> repo.jenkins-ci.org.
>
> If you look under "bytes out", of the 6.8TB that has been served over
> the haproxy uptime, 1TB is from maven.jenkins-ci.org.
>
> This is surprising because all the download traffic for Maven repository
> should be served through http://repo.jenkins-ci.org/
>
> I need to look into this a bit more.
>
>
> OTOH, note that this shows only the cumulative value. I recorded the
> value again 2.5 hours later, and the delta for maven was 169MB
> (extrapolates to 50GB/month), whereas the overall delta was 9.6GB
> (extrapolates to 2.7TB/month). So it's nowhere near big enough to
> explain the 1.45TB/month usage spike in May.
>
> In addition, the traffic to maven.jenkins-ci.org is reverse-proxied to
> repo.jenkins-ci.org, so if this accounted for the traffic increase,
> it should show up as a corresponding increase on the RX side. There's
> no such spike on the RX side.
>
> I'll continue digging...
>
>
> On 06/02/2014 11:15 AM, Kohsuke Kawaguchi wrote:
>>
>> Tyler got a surprising bill for the overage charge for cucumber, which
>> runs "jenkins-ci.org", "mirrors.jenkins-ci.org" and a number of other
>> virtual hosts.
>>
>> I think I set up vnstat the last time this happened to track
>> utilization. Here is the vnstat output.
>>
>>> kohsuke@cucumber:~$ vnstat -m
>>>
>>>  eth0  /  monthly
>>>
>>>        month        rx      |     tx      |    total    |   avg. rate
>>>     ------------------------+-------------+-------------+---------------
>>>       Jul '13    338.35 GiB |    3.21 TiB |    3.54 TiB |   11.36 Mbit/s
>>>       Aug '13    338.36 GiB |    2.86 TiB |    3.19 TiB |   10.23 Mbit/s
>>>       Sep '13    354.39 GiB |    3.52 TiB |    3.87 TiB |   12.82 Mbit/s
>>>       Oct '13    395.09 GiB |    4.20 TiB |    4.59 TiB |   14.72 Mbit/s
>>>       Nov '13    449.73 GiB |    3.51 TiB |    3.94 TiB |   13.07 Mbit/s
>>>       Dec '13    562.26 GiB |    3.68 TiB |    4.23 TiB |   13.56 Mbit/s
>>>       Jan '14    672.19 GiB |    3.91 TiB |    4.56 TiB |   14.64 Mbit/s
>>>       Feb '14    370.69 GiB |    3.13 TiB |    3.49 TiB |   12.39 Mbit/s
>>>       Mar '14    351.83 GiB |    3.33 TiB |    3.67 TiB |   11.77 Mbit/s
>>>       Apr '14    362.76 GiB |    3.39 TiB |    3.74 TiB |   12.40 Mbit/s
>>>       May '14    401.56 GiB |    4.80 TiB |    5.19 TiB |   16.65 Mbit/s
>>>       Jun '14     20.11 GiB |  241.83 GiB |  261.94 GiB |   16.07 Mbit/s
>>>     ------------------------+-------------+-------------+---------------
>>>     estimated    381.25 GiB |    4.48 TiB |    4.85 TiB |
>>
>> As you can see, the outbound traffic jumped in May.
>> Here's the daily output; as far as I can tell, the new trend is
>> continuing into June.
>>
>> In other words, we need to act on it ASAP to avoid another overage for June.
>>
>>
>>> kohsuke@cucumber:~$ vnstat -d
>>>
>>>  eth0  /  daily
>>>
>>>          day         rx      |     tx      |    total    |   avg. rate
>>>      ------------------------+-------------+-------------+---------------
>>>       05/04/14     11.61 GiB |  151.50 GiB |  163.11 GiB |   15.84 Mbit/s
>>>       05/05/14     18.37 GiB |  167.82 GiB |  186.18 GiB |   18.08 Mbit/s
>>>       05/06/14     12.54 GiB |  176.53 GiB |  189.07 GiB |   18.36 Mbit/s
>>>       05/07/14     13.07 GiB |  169.65 GiB |  182.73 GiB |   17.74 Mbit/s
>>>       05/08/14     12.49 GiB |  152.46 GiB |  164.95 GiB |   16.01 Mbit/s
>>>       05/09/14     13.54 GiB |  167.91 GiB |  181.45 GiB |   17.62 Mbit/s
>>>       05/10/14     10.72 GiB |  149.50 GiB |  160.22 GiB |   15.56 Mbit/s
>>>       05/11/14     15.25 GiB |  141.75 GiB |  157.01 GiB |   15.24 Mbit/s
>>>       05/12/14     16.06 GiB |  168.05 GiB |  184.11 GiB |   17.88 Mbit/s
>>>       05/13/14     13.25 GiB |  144.43 GiB |  157.68 GiB |   15.31 Mbit/s
>>>       05/14/14     18.24 GiB |  160.37 GiB |  178.61 GiB |   17.34 Mbit/s
>>>       05/15/14     14.30 GiB |  154.11 GiB |  168.41 GiB |   16.35 Mbit/s
>>>       05/16/14     12.99 GiB |  153.21 GiB |  166.20 GiB |   16.14 Mbit/s
>>>       05/17/14      9.87 GiB |  127.64 GiB |  137.51 GiB |   13.35 Mbit/s
>>>       05/18/14     11.43 GiB |  186.48 GiB |  197.91 GiB |   19.22 Mbit/s
>>>       05/19/14     14.08 GiB |  171.26 GiB |  185.35 GiB |   18.00 Mbit/s
>>>       05/20/14     13.47 GiB |  149.67 GiB |  163.14 GiB |   15.84 Mbit/s
>>>       05/21/14     12.67 GiB |  150.21 GiB |  162.89 GiB |   15.81 Mbit/s
>>>       05/22/14     13.43 GiB |  168.17 GiB |  181.61 GiB |   17.63 Mbit/s
>>>       05/23/14     14.06 GiB |  165.47 GiB |  179.53 GiB |   17.43 Mbit/s
>>>       05/24/14      9.75 GiB |  132.61 GiB |  142.36 GiB |   13.82 Mbit/s
>>>       05/25/14     10.10 GiB |  131.50 GiB |  141.60 GiB |   13.75 Mbit/s
>>>       05/26/14     13.81 GiB |  180.48 GiB |  194.28 GiB |   18.86 Mbit/s
>>>       05/27/14     14.51 GiB |  187.94 GiB |  202.45 GiB |   19.66 Mbit/s
>>>       05/28/14     12.76 GiB |  163.84 GiB |  176.60 GiB |   17.15 Mbit/s
>>>       05/29/14     11.95 GiB |  157.68 GiB |  169.63 GiB |   16.47 Mbit/s
>>>       05/30/14     11.62 GiB |  142.46 GiB |  154.08 GiB |   14.96 Mbit/s
>>>       05/31/14      9.97 GiB |  136.38 GiB |  146.35 GiB |   14.21 Mbit/s
>>>       06/01/14     11.49 GiB |  142.08 GiB |  153.56 GiB |   14.91 Mbit/s
>>>       06/02/14      8.62 GiB |   99.75 GiB |  108.37 GiB |   18.07 Mbit/s
>>>      ------------------------+-------------+-------------+---------------
>>>      estimated     14.82 GiB |  171.41 GiB |  186.22 GiB |
>>
>> I'm trying to get vnstat to print out more daily data going back to April
>> and March, but I notice that the infra rehaul was April 29-May 2, so I
>> suspect we changed something during that period that added more load.
>> In particular, I remember shrinking disk consumption on the OSUOSL
>> mirrors by removing old releases. I wonder if this somehow resulted in
>> the traffic increase.
>>
>>
>>
>
>


-- 
Kohsuke Kawaguchi | CloudBees, Inc. | http://cloudbees.com/
Try Jenkins Enterprise, our professional version of Jenkins

