[Jenkins-infra] Cucumber bandwidth situation

Kohsuke Kawaguchi kkawaguchi at cloudbees.com
Tue Jun 3 00:13:32 UTC 2014


BTW I filed https://issues.jenkins-ci.org/browse/INFRA-83 to give this
incident a name.
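
For anyone retracing the analysis quoted below, here is a minimal sketch of how per-virtual-host and per-client outbound totals could be summed from an access log. It assumes a hypothetical LogFormat of `%v %h %t "%r" %>s %I %O`, where `%I` and `%O` are the mod_logio byte counters. The actual log format on cucumber is not shown in this thread, and the host names and addresses in the sample are placeholders.

```python
from collections import defaultdict


def sum_bytes_out(lines, key_field=0):
    """Sum the last field (%O, bytes sent) grouped by one leading field.

    key_field=0 groups by virtual host (%v); key_field=1 groups by
    client address (%h). Both positions are assumptions of this sketch.
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Skip blank or malformed lines rather than failing on them.
        if len(fields) <= key_field or not fields[-1].isdigit():
            continue
        totals[fields[key_field]] += int(fields[-1])
    return dict(totals)


# Placeholder log lines in the assumed format (not real log data).
SAMPLE = [
    'ci.example.org 192.0.2.10 [02/Jun/2014:16:00:00 -0700] "GET /a.war HTTP/1.1" 200 250 1000',
    'ci.example.org 192.0.2.10 [02/Jun/2014:16:01:00 -0700] "GET /a.war HTTP/1.1" 200 250 1000',
    'ci.example.org 192.0.2.20 [02/Jun/2014:16:01:30 -0700] "GET / HTTP/1.1" 200 200 200',
    'updates.example.org 192.0.2.30 [02/Jun/2014:16:02:00 -0700] "GET /u.json HTTP/1.1" 200 180 50',
]

by_vhost = sum_bytes_out(SAMPLE, key_field=0)
by_client = sum_bytes_out(SAMPLE, key_field=1)

# Print the heaviest clients first, which is how a single repeat
# downloader would stand out.
for client, total in sorted(by_client.items(), key=lambda kv: kv[1], reverse=True):
    print(client, total)
```

Grouping by the first and last whitespace-separated fields keeps the sketch independent of spaces inside the timestamp and request fields.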


2014-06-02 17:08 GMT-07:00 Andrew Bayer <andrew.bayer at gmail.com>:

> Holy crapnuts. That's cartoonish.
>
> On Mon, Jun 2, 2014 at 4:57 PM, Kohsuke Kawaguchi
> <kkawaguchi at cloudbees.com> wrote:
> > I wonder if there's any way to track who this IP address belongs to. Port 22
> > responds with "SSH-2.0-NetScreen", which I assume is a Juniper NetScreen.
> > The IP address claims to be in San Francisco.
> >
> >
> > 2014-06-02 16:52 GMT-07:00 Kohsuke Kawaguchi <kkawaguchi at cloudbees.com>:
> >
> >>
> >> More progress.
> >>
> >> When the last overage happened we started using mod_logio to record total
> >> bytes in/out per request [1]. So I looked at which virtual host is eating
> >> bandwidth, and the result was quite surprising.
> >>
> >> Looking at the number of bytes transferred outbound per virtual host per
> >> week, I get this.
> >>
> >>> ci      489,290,628,016
> >>> main     12,203,528,309
> >>> maven     7,378,839,415
> >>> pkg       8,105,888,873
> >>> svn         937,561,148
> >>> updates  17,900,766,883
> >>
> >>
> >> So ci.jenkins-ci.org has served 490GB/week (!) of data. That's clearly too
> >> much.
> >>
> >> So I looked at its access log and noticed that 64.125.71.142 is
> >> downloading
> >> /job/jenkins_rc_branch/lastSuccessfulBuild/artifact/war/target/jenkins.war
> >> over and over.
> >>
> >> This file is about 68MB in size, and a download happens every minute.
> >> That's about 95GB/day, or 2.8TB/month if extrapolated.
> >>
> >> THIS GUY SINGLEHANDEDLY CONSUMED MORE THAN HALF THE BANDWIDTH!
> >>
> >>
> >>
> >> The actual download counts per week are as follows:
> >>
> >>     Apr,03  10165
> >>     Apr,10  10171
> >>     Apr,17  7426
> >>     Apr,24  10180
> >>     May,1   11203
> >>     May,8   10226
> >>     May,15  8975
> >>     May,22  10140
> >>
> >> There are occasional gaps (perhaps ci.jenkins-ci.org was down, or there's
> >> some pause in what he does), but still clearly this amounts to a ridiculous
> >> amount of network transfer.
> >>
> >> I'm going to ban this IP in the Apache config to stop this immediately.
> >>
> >>
> >> Going forward, we need a better approach to this than waiting for the
> >> overage to slap our face before noticing the problem. For that I think we
> >> need to have some monitoring in place to alert us in case of a sudden
> >> traffic increase.
> >>
> >> Also, as you see in the log above, this didn't start in May, so it doesn't
> >> explain the sudden increase in traffic in May. So we still have more
> >> analysis to do.
> >>
> >>
> >>
> >>
> >>
> >> [1] http://httpd.apache.org/docs/2.2/mod/mod_logio.html
> >>
> >>
> >> On 06/02/2014 03:39 PM, Kohsuke Kawaguchi wrote:
> >>>
> >>>
> >>> I looked at output from haproxy
> >>> http://kohsuke.org/private/20140602/haproxy-stats.png, and this shows a
> >>> large amount of activity under "maven", which is
> >>> http://maven.jenkins-ci.org/ that acts as a reverse proxy to
> >>> repo.jenkins-ci.org.
> >>>
> >>> If you look under "bytes out", of 6.8TB that has been served for the
> >>> duration of the haproxy uptime, 1TB is from maven.jenkins-ci.org.
> >>>
> >>> This is surprising because all the download traffic for the Maven
> >>> repository should be served through http://repo.jenkins-ci.org/
> >>>
> >>> I need to look into this a bit more.
> >>>
> >>>
> >>> OTOH, note that this shows only the cumulative value. I recorded the
> >>> value 2.5 hours later, and the delta for maven was 169MB (extrapolates
> >>> to 50GB/month) whereas delta for overall is 9.6GB (extrapolates to
> >>> 2.7TB/month.) So it's nowhere near big enough to explain the
> >>> 1.45TB/month usage spike in May.
> >>>
> >>> In addition, the traffic to maven.jenkins-ci.org is reverse-proxied to
> >>> repo.jenkins-ci.org, so if this accounted for the traffic increase,
> >>> it should show up as a corresponding increase on the RX side. There's
> >>> no such spike on the RX side.
> >>>
> >>> I'll continue digging...
> >>>
> >>>
> >>> On 06/02/2014 11:15 AM, Kohsuke Kawaguchi wrote:
> >>>>
> >>>>
> >>>> Tyler got a surprising bill for the overage charge for cucumber, which
> >>>> runs "jenkins-ci.org", "mirrors.jenkins-ci.org" and a number of other
> >>>> virtual hosts.
> >>>>
> >>>> I think I set up vnstat the last time this happened to track
> >>>> utilization. Here is the vnstat output.
> >>>>
> >>>>> kohsuke at cucumber:~$ vnstat -m
> >>>>>
> >>>>>  eth0  /  monthly
> >>>>>
> >>>>>        month        rx      |     tx      |    total    |   avg. rate
> >>>>>
> >>>>> ------------------------+-------------+-------------+---------------
> >>>>>       Jul '13    338.35 GiB |    3.21 TiB |    3.54 TiB |   11.36 Mbit/s
> >>>>>       Aug '13    338.36 GiB |    2.86 TiB |    3.19 TiB |   10.23 Mbit/s
> >>>>>       Sep '13    354.39 GiB |    3.52 TiB |    3.87 TiB |   12.82 Mbit/s
> >>>>>       Oct '13    395.09 GiB |    4.20 TiB |    4.59 TiB |   14.72 Mbit/s
> >>>>>       Nov '13    449.73 GiB |    3.51 TiB |    3.94 TiB |   13.07 Mbit/s
> >>>>>       Dec '13    562.26 GiB |    3.68 TiB |    4.23 TiB |   13.56 Mbit/s
> >>>>>       Jan '14    672.19 GiB |    3.91 TiB |    4.56 TiB |   14.64 Mbit/s
> >>>>>       Feb '14    370.69 GiB |    3.13 TiB |    3.49 TiB |   12.39 Mbit/s
> >>>>>       Mar '14    351.83 GiB |    3.33 TiB |    3.67 TiB |   11.77 Mbit/s
> >>>>>       Apr '14    362.76 GiB |    3.39 TiB |    3.74 TiB |   12.40 Mbit/s
> >>>>>       May '14    401.56 GiB |    4.80 TiB |    5.19 TiB |   16.65 Mbit/s
> >>>>>       Jun '14     20.11 GiB |  241.83 GiB |  261.94 GiB |   16.07 Mbit/s
> >>>>>
> >>>>> ------------------------+-------------+-------------+---------------
> >>>>>     estimated    381.25 GiB |    4.48 TiB |    4.85 TiB |
> >>>>
> >>>>
> >>>> As you see, the outbound traffic jumped in May.
> >>>> Here's the daily output; as far as I can tell, the new trend is
> >>>> continuing in June.
> >>>>
> >>>> In other words, we need to act on it ASAP to avoid another overage for
> >>>> June.
> >>>>
> >>>>
> >>>>> kohsuke at cucumber:~$ vnstat -d
> >>>>>
> >>>>>  eth0  /  daily
> >>>>>
> >>>>>          day         rx      |     tx      |    total    |   avg. rate
> >>>>>
> >>>>> ------------------------+-------------+-------------+---------------
> >>>>>       05/04/14     11.61 GiB |  151.50 GiB |  163.11 GiB |   15.84 Mbit/s
> >>>>>       05/05/14     18.37 GiB |  167.82 GiB |  186.18 GiB |   18.08 Mbit/s
> >>>>>       05/06/14     12.54 GiB |  176.53 GiB |  189.07 GiB |   18.36 Mbit/s
> >>>>>       05/07/14     13.07 GiB |  169.65 GiB |  182.73 GiB |   17.74 Mbit/s
> >>>>>       05/08/14     12.49 GiB |  152.46 GiB |  164.95 GiB |   16.01 Mbit/s
> >>>>>       05/09/14     13.54 GiB |  167.91 GiB |  181.45 GiB |   17.62 Mbit/s
> >>>>>       05/10/14     10.72 GiB |  149.50 GiB |  160.22 GiB |   15.56 Mbit/s
> >>>>>       05/11/14     15.25 GiB |  141.75 GiB |  157.01 GiB |   15.24 Mbit/s
> >>>>>       05/12/14     16.06 GiB |  168.05 GiB |  184.11 GiB |   17.88 Mbit/s
> >>>>>       05/13/14     13.25 GiB |  144.43 GiB |  157.68 GiB |   15.31 Mbit/s
> >>>>>       05/14/14     18.24 GiB |  160.37 GiB |  178.61 GiB |   17.34 Mbit/s
> >>>>>       05/15/14     14.30 GiB |  154.11 GiB |  168.41 GiB |   16.35 Mbit/s
> >>>>>       05/16/14     12.99 GiB |  153.21 GiB |  166.20 GiB |   16.14 Mbit/s
> >>>>>       05/17/14      9.87 GiB |  127.64 GiB |  137.51 GiB |   13.35 Mbit/s
> >>>>>       05/18/14     11.43 GiB |  186.48 GiB |  197.91 GiB |   19.22 Mbit/s
> >>>>>       05/19/14     14.08 GiB |  171.26 GiB |  185.35 GiB |   18.00 Mbit/s
> >>>>>       05/20/14     13.47 GiB |  149.67 GiB |  163.14 GiB |   15.84 Mbit/s
> >>>>>       05/21/14     12.67 GiB |  150.21 GiB |  162.89 GiB |   15.81 Mbit/s
> >>>>>       05/22/14     13.43 GiB |  168.17 GiB |  181.61 GiB |   17.63 Mbit/s
> >>>>>       05/23/14     14.06 GiB |  165.47 GiB |  179.53 GiB |   17.43 Mbit/s
> >>>>>       05/24/14      9.75 GiB |  132.61 GiB |  142.36 GiB |   13.82 Mbit/s
> >>>>>       05/25/14     10.10 GiB |  131.50 GiB |  141.60 GiB |   13.75 Mbit/s
> >>>>>       05/26/14     13.81 GiB |  180.48 GiB |  194.28 GiB |   18.86 Mbit/s
> >>>>>       05/27/14     14.51 GiB |  187.94 GiB |  202.45 GiB |   19.66 Mbit/s
> >>>>>       05/28/14     12.76 GiB |  163.84 GiB |  176.60 GiB |   17.15 Mbit/s
> >>>>>       05/29/14     11.95 GiB |  157.68 GiB |  169.63 GiB |   16.47 Mbit/s
> >>>>>       05/30/14     11.62 GiB |  142.46 GiB |  154.08 GiB |   14.96 Mbit/s
> >>>>>       05/31/14      9.97 GiB |  136.38 GiB |  146.35 GiB |   14.21 Mbit/s
> >>>>>       06/01/14     11.49 GiB |  142.08 GiB |  153.56 GiB |   14.91 Mbit/s
> >>>>>       06/02/14      8.62 GiB |   99.75 GiB |  108.37 GiB |   18.07 Mbit/s
> >>>>>
> >>>>> ------------------------+-------------+-------------+---------------
> >>>>>      estimated     14.82 GiB |  171.41 GiB |  186.22 GiB |
> >>>>
> >>>>
> >>>> I'm trying to get vnstat to print out more daily data going back to April
> >>>> and March, but I notice that the infra rehaul was April 29-May 2, so I
> >>>> suspect we changed something during that period that added more load.
> >>>> In particular, I remember shrinking disk consumption on the OSUOSL
> >>>> mirrors by removing old releases. I wonder if this somehow resulted in
> >>>> the traffic increase.
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >> --
> >> Kohsuke Kawaguchi | CloudBees, Inc. | http://cloudbees.com/
> >> Try Jenkins Enterprise, our professional version of Jenkins
> >
> >
> >
> >
> > --
> > Kohsuke Kawaguchi
> >
> > _______________________________________________
> > Jenkins-infra mailing list
> > Jenkins-infra at lists.jenkins-ci.org
> > http://lists.jenkins-ci.org/mailman/listinfo/jenkins-infra
> >
>
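
The extrapolations quoted above can be re-derived with a few lines of arithmetic. This is only a rough cross-check: it assumes a 30-day month, treats the quoted "MB" and "GB" figures as binary units for the jenkins.war estimate, and the function names are made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month


def repeated_download_gib_per_day(file_mib, downloads_per_day):
    """Daily transfer in GiB for one file fetched a fixed number of times."""
    return file_mib * downloads_per_day / 1024


def extrapolate(amount, hours_observed, hours_target):
    """Scale a transfer measured over hours_observed to hours_target."""
    return amount * hours_target / hours_observed


# jenkins.war: about 68 MB, fetched once per minute.
war_gib_per_day = repeated_download_gib_per_day(68, MINUTES_PER_DAY)
war_tib_per_month = war_gib_per_day * 30 / 1024

# haproxy deltas over 2.5 hours: 169 MB for maven, 9.6 GB overall.
maven_gb_per_month = extrapolate(169, 2.5, HOURS_PER_MONTH) / 1000
overall_tb_per_month = extrapolate(9.6, 2.5, HOURS_PER_MONTH) / 1000

print(f"jenkins.war: {war_gib_per_day:.1f} GiB/day, {war_tib_per_month:.1f} TiB/month")
print(f"maven: {maven_gb_per_month:.0f} GB/month, overall: {overall_tb_per_month:.1f} TB/month")
```

Under those assumptions the results land close to the figures in the thread (roughly 95GB/day and 2.8TB/month for the repeated download, about 50GB/month for maven and under 3TB/month overall).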



-- 
Kohsuke Kawaguchi