[Jenkins-infra] (fwd) Re: Increasing rate-limit for authenticated activity on busy/large (jenkinsci) org

Andrew Bayer andrew.bayer at gmail.com
Wed Oct 5 16:06:51 UTC 2016


So basically, we're abusing the GH API. Nice. =)

A.

On Wed, Oct 5, 2016 at 5:46 PM, R. Tyler Croy <tyler at monkeypox.org> wrote:

> ----- Forwarded message from Ivan Žužak <support at github.com> -----
>
> Date: Wed, 05 Oct 2016 04:49:20 -0700
> From: Ivan Žužak <support at github.com>
> To: "R. Tyler Croy" <tyler at monkeypox.org>
> Subject: Re: Increasing rate-limit for authenticated activity on
> busy/large (jenkinsci) org
> Message-ID: <discussions/07023a048a5a11e69316559b2eb2b6a1/comments/
> 3256985 at github.com>
>
> Hello there, Tyler,
>
> Thanks for reaching out. API rate limit increases are something we offer
> rarely, especially permanent ones. The current limits help us keep the API
> fast and reliable for all our users and applications, not just one. For the
> same reason, hitting the API rate limit shouldn't be considered abnormal --
> it's a core part of using the API and the API returns information about
> your remaining quota and when it will refresh with every response. In most
> cases, we try to provide some advice on optimizing API usage so that you
> don't even need a rate limit increase.
>
> I've just had a look at our API traffic logs for that account you're
> using, and I wanted to share some stats with you.
>
> Here are the top 10 API endpoints you're hitting (over the past week):
>
> Endpoint                                         Requests    Share
> nil (requests that were rate limited)             204,138  20.349%
> /rate_limit                                       177,343  17.678%
> /repositories/:repository_id/commits/*            176,145  17.558%
> /repositories/:repository_id/branches             105,497  10.516%
> /repositories/:repository_id/events                99,464   9.915%
> /repositories/:repository_id/collaborators         47,637   4.748%
> /repositories/:repository_id/pulls/:id/comments    37,171   3.705%
> /repositories/:repository_id/contents/?*           30,111   3.002%
> /repositories/:repository_id/pulls                 29,056   2.896%
> /repositories/:repository_id                       18,348   1.829%
>
> And the top 10 resources:
>
> Resource                                Requests    Share
> /rate_limit                              177,343  17.678%
> /repositories/1103607/collaborators       20,690   2.062%
> /repos/jenkinsci/jenkins/contents/        15,998   1.595%
> /repos/jenkinsci/jenkins/collaborators    10,344   1.031%
> /user                                      7,695   0.767%
> /                                          7,562   0.754%
> /repositories/612587/collaborators         7,130   0.711%
> /repos/jenkins-infra/jenkins.io            6,642   0.662%
> /repos/jenkins-inc/securitay/contents/     5,254   0.524%
> /repos/jenkins-inc/borat/contents/         4,007   0.399%
>
> Top HTTP request methods and response statuses:
>
> Method  Requests    Share
> get      996,922  99.376%
> post       6,220    0.62%
> put           41   0.004%
> delete         2       0%
>
> Status  Requests    Share
> 200      792,833  79.032%
> 403      204,038  20.339%
> 201        3,861   0.385%
> 422        2,319   0.231%
> 404           69   0.007%
> 301           42   0.004%
> 204           13   0.001%
> 202           10   0.001%
>
> And the tokens you're using to make those requests:
>
> Token     Requests    Share  Description
> 52986225   852,452  84.974%  Jenkins JIRA (OAuth application)
> 44790939    92,556   9.226%  ci.jenkins.io pipeline token (personal token)
> 28723209    47,502   4.735%  demo.jenkins-ci.org (personal token)
>
> There are a few things here you might consider optimizing. First, you're
> making a lot of requests over the API rate limit, which is generally
> considered abuse of the API:
>
> https://developer.github.com/guides/best-practices-for-integrators/#dealing-with-rate-limits
>
> Can you add some checks so that you don't make requests after you've hit
> the rate limit? Some of those requests are API calls to check your rate
> limit, but I don't see a reason why you'd need to call that endpoint so
> frequently either -- you can call it once to see when the rate limit will
> refresh and sleep until that moment.
>
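
A minimal sketch of the "check once, then sleep until the reset" advice above, in Python. It is an illustration, not code from any Jenkins plugin. It assumes the quota is read from the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers (the reset value being a UTC epoch timestamp), and that `fetch` is a hypothetical caller-supplied function returning `(status, headers, body)`, where `headers` is a plain dict with exactly those key spellings.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to pause before the next request.

    Returns 0.0 while quota remains; otherwise the number of seconds left
    until the epoch timestamp given in X-RateLimit-Reset.
    """
    now = time.time() if now is None else now
    # Assumption: a missing header means "quota still available".
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    reset_at = int(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)


def throttled(fetch, sleep=time.sleep, clock=time.time):
    """Wrap fetch(url) so it waits out an exhausted quota instead of retrying.

    `fetch` is a hypothetical callable returning (status, headers, body).
    `sleep` and `clock` are injectable so the behaviour can be tested
    without real waiting.
    """
    wait_until = 0.0

    def wrapper(url):
        nonlocal wait_until
        delay = wait_until - clock()
        if delay > 0:
            sleep(delay)  # one sleep, rather than polling /rate_limit
        status, headers, body = fetch(url)
        # Remember when the quota refreshes, based on this response alone.
        wait_until = clock() + seconds_until_reset(headers, now=clock())
        return status, headers, body

    return wrapper
```

Because every response carries the quota headers, a wrapper like this never needs a separate call to /rate_limit and never sends a request that is certain to come back as a 403.
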
> Next, it seems that you are fetching and refetching the same resources
> over and over again in short periods of time. For example, here are some
> stats for a 1-hour period when you hit the rate limits:
>
> Resource                                         Requests    Share
> /repos/jenkinsci/elasticbox-plugin/pulls               62   1.243%
> /repos/jenkinsci/jenkins/pulls                         49   0.982%
> /repositories/1103607/pulls                            49   0.982%
> /repos/jenkinsci/ec2-plugin/pulls                      48   0.962%
> /repos/jenkinsci/instance-identity-module/pulls        39   0.782%
> /repositories/1169210/pulls                            36   0.722%
> /repos/jenkinsci/jacoco-plugin/pulls                   30   0.601%
> /repos/jenkinsci/jenkins/pulls/2570/comments           21   0.421%
> /repos/jenkinsci/instant-messaging-plugin/pulls        19   0.381%
> /user                                                  19   0.381%
>
> All of those requests were actually made within a 10-minute period, which
> was enough to drain your quota completely. For example, you fetched the
> list of comments for a single pull request 21 times within those 10 minutes
> (/repos/jenkinsci/jenkins/pulls/2570/comments), and as far as I can tell
> -- there's only one page of comments there (you weren't fetching multiple
> pages). The same is true for fetching the list of pull requests for some
> repositories -- you fetched /repos/jenkinsci/elasticbox-plugin/pulls 62
> times within a single minute (again, only the first page). When you
> aggregate such behavior over many pull requests and repositories -- it
> drains your quota really fast.
>
> I recommend you look into caching some of that data on your end (the API
> will reward you for that:
> https://developer.github.com/v3/#conditional-requests), or even better --
> using webhooks so that you fetch data once after it's modified (and store
> it on your end) instead of every time you need it.
>
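
A rough sketch of the caching idea above, using conditional requests. It assumes the API returns an `ETag` header with each 200 response and answers 304 Not Modified when that value is sent back in `If-None-Match`. The class name `ETagCache` and the `fetch(url, headers)` callable are hypothetical and stand in for whatever HTTP client is actually in use.

```python
class ETagCache:
    """Cache response bodies per URL and revalidate them with If-None-Match."""

    def __init__(self, fetch):
        # fetch(url, request_headers) -> (status, response_headers, body)
        # is a hypothetical caller-supplied function.
        self._fetch = fetch
        self._entries = {}  # url -> (etag, body)

    def get(self, url):
        cached = self._entries.get(url)
        request_headers = {"If-None-Match": cached[0]} if cached else {}
        status, headers, body = self._fetch(url, request_headers)
        if status == 304 and cached:
            # Nothing changed on the server: reuse the stored copy.
            return cached[1]
        etag = headers.get("ETag")
        if status == 200 and etag:
            self._entries[url] = (etag, body)
        return body
```

With something like this in front of the pull request lookups, polling /repos/jenkinsci/elasticbox-plugin/pulls 62 times in a minute would transfer the list once and receive 304s for the rest, which GitHub's documentation describes as not counting against the rate limit.
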
> I'm guessing that with some optimizations, you might not even need a rate
> limit increase. If you notice that you do even after optimizing your usage
> -- let us know and we'd be happy to take a look at our logs again and
> discuss a possible increase with the team.
>
> I hope these notes are helpful, and let me know if you need any other
> information from our end.
>
> Best,
> Ivan
>
> > Hello, I manage the infrastructure for the Jenkins project
> > (https://jenkins.io) and we're using some GitHub integrations to provide
> > build/test/release services for developers who participate in our GitHub
> > organization on https://ci.jenkins.io
> >
> > Unfortunately, despite being authenticated (via the jenkinsadmin account)
> > we are regularly hitting the rate-limit on API calls.
> >
> > I was hoping we could get the rate-limit raised for calls? I'm not sure
> > what is reasonable, but 5000 per hour does seem rather low for the number
> > of commit statuses and repo lookups our Jenkins installation needs to
> > perform.
> >
> > Please let me know if this is doable, and if any more information is
> > necessary from our side.
> >
> > Cheers
> > - R. Tyler Croy
>
> ----- End forwarded message -----
>
> - R. Tyler Croy
>
> ------------------------------------------------------
>      Code: <https://github.com/rtyler>
>   Chatter: <https://twitter.com/agentdero>
>
>   % gpg --keyserver keys.gnupg.net --recv-key 1426C7DC3F51E16F
> ------------------------------------------------------
>
> _______________________________________________
> Jenkins-infra mailing list
> Jenkins-infra at lists.jenkins-ci.org
> http://lists.jenkins-ci.org/mailman/listinfo/jenkins-infra
>
>