[Jenkins-infra] (fwd) Re: Increasing rate-limit for authenticated activity on busy/large (jenkinsci) org
R. Tyler Croy
tyler at monkeypox.org
Wed Oct 5 15:46:44 UTC 2016
----- Forwarded message from Ivan Žužak <support at github.com> -----
Date: Wed, 05 Oct 2016 04:49:20 -0700
From: Ivan Žužak <support at github.com>
To: "R. Tyler Croy" <tyler at monkeypox.org>
Subject: Re: Increasing rate-limit for authenticated activity on busy/large (jenkinsci) org
Message-ID: <discussions/07023a048a5a11e69316559b2eb2b6a1/comments/3256985 at github.com>
Hello there, Tyler,
Thanks for reaching out. API rate limit increases are something we offer rarely, especially permanent ones. The current limits help us keep the API fast and reliable for all our users and applications, not just one. For the same reason, hitting the API rate limit shouldn't be considered abnormal -- it's a core part of using the API, and the API returns information about your remaining quota and when it will refresh with every response. In most cases, we try to provide some advice on optimizing API usage so that you don't even need a rate limit increase.
I've just had a look at our API traffic logs for that account you're using, and I wanted to share some stats with you.
Here are the top 10 API endpoints you're hitting (over the past week):
nil                                              204,138  20.349%  (these are requests that were rate limited)
/rate_limit                                      177,343  17.678%
/repositories/:repository_id/commits/*           176,145  17.558%
/repositories/:repository_id/branches            105,497  10.516%
/repositories/:repository_id/events               99,464   9.915%
/repositories/:repository_id/collaborators        47,637   4.748%
/repositories/:repository_id/pulls/:id/comments   37,171   3.705%
/repositories/:repository_id/contents/?*          30,111   3.002%
/repositories/:repository_id/pulls                29,056   2.896%
/repositories/:repository_id                      18,348   1.829%
And the top 10 resources:
/rate_limit                             177,343  17.678%
/repositories/1103607/collaborators      20,690   2.062%
/repos/jenkinsci/jenkins/contents/       15,998   1.595%
/repos/jenkinsci/jenkins/collaborators   10,344   1.031%
/user                                     7,695   0.767%
/                                         7,562   0.754%
/repositories/612587/collaborators        7,130   0.711%
/repos/jenkins-infra/jenkins.io           6,642   0.662%
/repos/jenkins-inc/securitay/contents/    5,254   0.524%
/repos/jenkins-inc/borat/contents/        4,007   0.399%
Top HTTP request methods and response statuses:
get     996,922  99.376%
post      6,220    0.62%
put          41   0.004%
delete        2       0%
200     792,833  79.032%
403     204,038  20.339%
201       3,861   0.385%
422       2,319   0.231%
404          69   0.007%
301          42   0.004%
204          13   0.001%
202          10   0.001%
And the tokens you're using to make those requests:
52986225  852,452  84.974%  Jenkins JIRA (OAuth application)
44790939   92,556   9.226%  ci.jenkins.io pipeline token (personal token)
28723209   47,502   4.735%  demo.jenkins-ci.org (personal token)
There are a few things here you might consider optimizing. First, you're making a lot of requests over the API rate limit, which is generally considered abuse of the API:
https://developer.github.com/guides/best-practices-for-integrators/#dealing-with-rate-limits
Can you add some checks so that you don't make requests after you've hit the rate limit? Some of those requests are API calls to check your rate limit, but I don't see a reason why you'd need to call that endpoint so frequently either -- you can call it once to see when the rate limit will refresh and sleep until that moment.
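The check-then-sleep approach suggested above can be sketched roughly as follows. This is only an illustration, not the code of any Jenkins integration: the function names `seconds_to_wait` and `pause_if_exhausted` are hypothetical, and the sketch assumes the caller passes in the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers that the GitHub API returns (the reset value being UTC epoch seconds).

```python
import time


def seconds_to_wait(headers, now=None):
    """Return how many seconds to pause before sending another request.

    `headers` is a mapping of response header names to string values.
    Header names are matched case-insensitively.
    """
    now = time.time() if now is None else now
    lowered = {name.lower(): value for name, value in headers.items()}

    # Assumption: missing headers mean "no limit information", so do not wait.
    remaining = int(lowered.get("x-ratelimit-remaining", 1))
    reset_at = int(lowered.get("x-ratelimit-reset", 0))

    if remaining > 0:
        return 0.0
    # Quota exhausted: wait until the reset time instead of polling /rate_limit.
    return max(0.0, float(reset_at) - now)


def pause_if_exhausted(headers, sleep=time.sleep):
    """Sleep once until the quota refreshes; return the number of seconds slept."""
    delay = seconds_to_wait(headers)
    if delay > 0:
        sleep(delay)
    return delay
```

A client would call `pause_if_exhausted` with the headers of the most recent response before issuing the next request, so that no calls are made while the quota is at zero.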
Next, it seems that you are fetching and re-fetching the same resources over and over again in short periods of time. For example, here are some stats for a 1-hour period when you hit the rate limits:
/repos/jenkinsci/elasticbox-plugin/pulls         62  1.243%
/repos/jenkinsci/jenkins/pulls                   49  0.982%
/repositories/1103607/pulls                      49  0.982%
/repos/jenkinsci/ec2-plugin/pulls                48  0.962%
/repos/jenkinsci/instance-identity-module/pulls  39  0.782%
/repositories/1169210/pulls                      36  0.722%
/repos/jenkinsci/jacoco-plugin/pulls             30  0.601%
/repos/jenkinsci/jenkins/pulls/2570/comments     21  0.421%
/repos/jenkinsci/instant-messaging-plugin/pulls  19  0.381%
/user                                            19  0.381%
All of those requests were actually made within a 10-minute period, which was enough to drain your quota completely. For example, you fetched the list of comments for a single pull request 21 times within those 10 minutes (/repos/jenkinsci/jenkins/pulls/2570/comments), and as far as I can tell -- there's only one page of comments there (you weren't fetching multiple pages). The same is true for fetching the list of pull requests for some repositories -- you fetched /repos/jenkinsci/elasticbox-plugin/pulls 62 times within a single minute (again, only the first page). When you aggregate such behavior over many pull requests and repositories -- it drains your quota really fast.
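One way to collapse repeated fetches like those described above is a short-lived local cache in front of the API client. The following is a minimal sketch under stated assumptions: the class name `TTLCache`, the `fetch` callable, and the 60-second lifetime are all hypothetical choices for illustration, not anything taken from the Jenkins integrations.

```python
import time


class TTLCache:
    """Serve repeated lookups of the same URL from memory for a short time."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch        # hypothetical callable: fetch(url) -> data
        self._ttl = ttl_seconds    # assumed lifetime; tune to how fresh data must be
        self._clock = clock
        self._entries = {}         # url -> (expires_at, data)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            # Still fresh: no API request is made.
            return entry[1]
        # Missing or expired: fetch once and remember the result.
        data = self._fetch(url)
        self._entries[url] = (now + self._ttl, data)
        return data
```

With a cache like this, 62 lookups of the same pull request list inside one minute would cost a single API request rather than 62.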
I recommend you look into caching some of that data on your end (the API will reward you for that: https://developer.github.com/v3/#conditional-requests), or even better -- using webhooks so that you fetch data once after it's modified (and store it on your end) instead of every time you need it.
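The conditional-request caching mentioned above can be sketched as below. This is an illustration only: `ConditionalCache` and the injected `fetch` callable are hypothetical names, and the sketch assumes `fetch(url, headers)` returns a `(status, response_headers, body)` tuple. It relies on the standard HTTP `ETag` / `If-None-Match` mechanism, where an unchanged resource is answered with `304 Not Modified`; the linked documentation describes such responses as not counting against the rate limit.

```python
class ConditionalCache:
    """Re-use cached bodies when the server reports the resource is unchanged."""

    def __init__(self, fetch):
        # Hypothetical callable: fetch(url, headers) -> (status, headers, body)
        self._fetch = fetch
        self._entries = {}  # url -> (etag, body)

    def get(self, url):
        cached = self._entries.get(url)
        request_headers = {}
        if cached is not None:
            # Ask the server to send a body only if the resource has changed.
            request_headers["If-None-Match"] = cached[0]

        status, response_headers, body = self._fetch(url, request_headers)

        if status == 304 and cached is not None:
            # Unchanged since the last fetch: serve the stored copy.
            return cached[1]

        etag = response_headers.get("ETag")
        if status == 200 and etag:
            # Remember the new version for the next conditional request.
            self._entries[url] = (etag, body)
        return body
```

In practice `fetch` would wrap whatever HTTP client the integration already uses; keeping it injectable also makes the cache easy to exercise without network access.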
I'm guessing that with some optimizations, you might not even need a rate limit increase. If you notice that you do even after optimizing your usage -- let us know and we'd be happy to take a look at our logs again and discuss a possible increase with the team.
I hope these notes are helpful, and let me know if you need any other information from our end.
Best,
Ivan
> Hello, I manage the infrastructure for the Jenkins project (https://jenkins.io)
> and we're using some GitHub integrations to provide build/test/release services
> for developers who participate in our GitHub organization on
> https://ci.jenkins.io
>
> Unfortunately, despite being authenticated (via the jenkinsadmin account) we
> are regularly hitting the rate-limit on API calls.
>
> I was hoping we could get the rate-limit raised for calls? I'm not sure what is
> reasonable, but 5000 per hour does seem rather low for the number of commit
> statuses and repo lookups our Jenkins installation needs to perform.
>
> Please let me know if this is doable, and if any more information is necessary
> from our side.
>
> Cheers
> - R. Tyler Croy
----- End forwarded message -----
- R. Tyler Croy
------------------------------------------------------
Code: <https://github.com/rtyler>
Chatter: <https://twitter.com/agentdero>
% gpg --keyserver keys.gnupg.net --recv-key 1426C7DC3F51E16F
------------------------------------------------------