[Jenkins-infra] (fwd) Re: Increasing rate-limit for authenticated activity on busy/large (jenkinsci) org

R. Tyler Croy tyler at monkeypox.org
Wed Oct 5 15:46:44 UTC 2016


----- Forwarded message from Ivan Žužak <support at github.com> -----

Date: Wed, 05 Oct 2016 04:49:20 -0700
From: Ivan Žužak <support at github.com>
To: "R. Tyler Croy" <tyler at monkeypox.org>
Subject: Re: Increasing rate-limit for authenticated activity on busy/large (jenkinsci) org
Message-ID: <discussions/07023a048a5a11e69316559b2eb2b6a1/comments/3256985 at github.com>

Hello there, Tyler,

Thanks for reaching out. API rate limit increases are something we offer rarely, especially permanent ones. The current limits help us keep the API fast and reliable for all our users and applications, not just one. For the same reason, hitting the API rate limit shouldn't be considered abnormal -- it's a core part of using the API, and the API returns information about your remaining quota and when it will refresh with every response. In most cases, we try to provide some advice on optimizing API usage so that you don't even need a rate limit increase.

I've just had a look at our API traffic logs for that account you're using, and I wanted to share some stats with you. 

Here are the top 10 API endpoints you're hitting (over the past week):

nil	204,138	20.349%	(these are requests that were rate limited)
/rate_limit	177,343	17.678%	
/repositories/:repository_id/commits/*	176,145	17.558%	
/repositories/:repository_id/branches	105,497	10.516%	
/repositories/:repository_id/events	99,464	9.915%	
/repositories/:repository_id/collaborators	47,637	4.748%	
/repositories/:repository_id/pulls/:id/comments	37,171	3.705%	
/repositories/:repository_id/contents/?*	30,111	3.002%	
/repositories/:repository_id/pulls	29,056	2.896%	
/repositories/:repository_id	18,348	1.829%	

And the top 10 resources:

/rate_limit	177,343	17.678%	
/repositories/1103607/collaborators	20,690	2.062%	
/repos/jenkinsci/jenkins/contents/	15,998	1.595%	
/repos/jenkinsci/jenkins/collaborators	10,344	1.031%	
/user	7,695	0.767%	
/	7,562	0.754%	
/repositories/612587/collaborators	7,130	0.711%	
/repos/jenkins-infra/jenkins.io	6,642	0.662%	
/repos/jenkins-inc/securitay/contents/	5,254	0.524%	
/repos/jenkins-inc/borat/contents/	4,007	0.399%

Top HTTP request methods and response statuses:

get	996,922	99.376%	
post	6,220	0.62%	
put	41	0.004%	
delete	2	0%

200	792,833	79.032%	
403	204,038	20.339%	
201	3,861	0.385%	
422	2,319	0.231%	
404	69	0.007%	
301	42	0.004%	
204	13	0.001%	
202	10	0.001%

And the tokens you're using to make those requests:

52986225	852,452	84.974%	Jenkins JIRA (OAuth application)
44790939	92,556	9.226%	ci.jenkins.io pipeline token (personal token)
28723209	47,502	4.735%	demo.jenkins-ci.org (personal token)

There are a few things here you might consider optimizing. First, you're making a lot of requests over the API rate limit, which is generally considered abuse of the API:

https://developer.github.com/guides/best-practices-for-integrators/#dealing-with-rate-limits

Can you add some checks so that you don't make requests after you've hit the rate limit? Some of those requests are API calls to check your rate limit, but I don't see a reason why you'd need to call that endpoint so frequently either -- you can call it once to see when the rate limit will refresh and sleep until that moment. 
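The check suggested above can be sketched as follows. This is a minimal illustration, not the code any Jenkins integration actually runs: the helper name `seconds_until_reset` is hypothetical, and it assumes the response headers are available as a plain dict keyed by the `X-RateLimit-Remaining` and `X-RateLimit-Reset` names that the GitHub API sends with each response (the reset value being a UTC epoch timestamp in seconds).

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request.

    Hypothetical helper: `headers` is assumed to be a plain dict holding
    the rate-limit headers from the most recent API response.
    Returns 0.0 while quota remains.
    """
    now = time.time() if now is None else now
    # If the header is missing, assume quota remains rather than blocking.
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    reset_at = int(headers.get("X-RateLimit-Reset", 0))
    if remaining > 0:
        return 0.0
    # Quota exhausted: wait until the reset time, never a negative duration.
    return float(max(0, reset_at - now))
```

A caller would pass the headers of each response to this helper and call `time.sleep()` on the result before issuing the next request, instead of polling /rate_limit or retrying into a 403.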

Next, it seems that you are fetching and re-fetching the same resources over and over again in short periods of time. For example, here are some stats for a 1-hour period when you hit the rate limits:

/repos/jenkinsci/elasticbox-plugin/pulls	62	1.243%	
/repos/jenkinsci/jenkins/pulls	49	0.982%	
/repositories/1103607/pulls	49	0.982%	
/repos/jenkinsci/ec2-plugin/pulls	48	0.962%	
/repos/jenkinsci/instance-identity-module/pulls	39	0.782%	
/repositories/1169210/pulls	36	0.722%	
/repos/jenkinsci/jacoco-plugin/pulls	30	0.601%	
/repos/jenkinsci/jenkins/pulls/2570/comments	21	0.421%	
/repos/jenkinsci/instant-messaging-plugin/pulls	19	0.381%	
/user	19	0.381%

All of those requests were actually made within a 10-minute period, which was enough to drain your quota completely. For example, you fetched the list of comments for a single pull request 21 times within those 10 minutes (/repos/jenkinsci/jenkins/pulls/2570/comments), and as far as I can tell -- there's only one page of comments there (you weren't fetching multiple pages). The same is true for fetching the list of pull requests for some repositories -- you fetched /repos/jenkinsci/elasticbox-plugin/pulls 62 times within a single minute (again, only the first page). When you aggregate such behavior over many pull requests and repositories -- it drains your quota really fast.

I recommend you look into caching some of that data on your end (the API will reward you for that: https://developer.github.com/v3/#conditional-requests), or even better -- using webhooks so that you fetch data once after it's modified (and store it on your end) instead of every time you need it.
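One way to picture the caching advice is a small ETag-based cache built on conditional requests. The sketch below rests on several assumptions: the class name `ConditionalCache` and the `fetch(url, headers)` callable are hypothetical stand-ins for whatever HTTP client is in use, and the cache is an in-memory dict. It relies only on standard HTTP behavior (`ETag`, `If-None-Match`, status 304), which is what the linked conditional-requests documentation describes.

```python
class ConditionalCache:
    """In-memory cache that revalidates entries with If-None-Match.

    `fetch` is a hypothetical callable supplied by the caller:
        fetch(url, request_headers) -> (status, response_headers, body)
    """

    def __init__(self, fetch):
        self._fetch = fetch
        self._entries = {}  # url -> (etag, body)

    def get(self, url):
        cached = self._entries.get(url)
        request_headers = {}
        if cached is not None:
            # Ask the server to send the body only if it has changed.
            request_headers["If-None-Match"] = cached[0]
        status, response_headers, body = self._fetch(url, request_headers)
        if status == 304 and cached is not None:
            # Not modified: reuse the stored body.
            return cached[1]
        etag = response_headers.get("ETag")
        if status == 200 and etag:
            self._entries[url] = (etag, body)
        return body
```

With something like this in front of the 21 repeated fetches of a single pull request's comments, only the first would transfer the full response; the rest would be revalidations answered with 304 as long as the comments had not changed.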

I'm guessing that with some optimizations, you might not even need a rate limit increase. If you notice that you do even after optimizing your usage -- let us know and we'd be happy to take a look at our logs again and discuss a possible increase with the team.

I hope these notes are helpful, and let me know if you need any other information from our end.

Best,
Ivan

> Hello, I manage the infrastructure for the Jenkins project (https://jenkins.io)
> and we're using some GitHub integrations to provide build/test/release services
> for developers who participate in our GitHub organization on
> https://ci.jenkins.io
> 
> Unfortunately, despite being authenticated (via the jenkinsadmin account) we
> are regularly hitting the rate-limit on API calls.
> 
> I was hoping we could get the rate-limit raised for calls? I'm not sure what is
> reasonable, but 5000 per hour does seem rather low for the number of commit
> statuses and repo lookups our Jenkins installation needs to perform.
> 
> Please let me know if this is doable, and if any more information is necessary
> from our side.
> 
> Cheers
> - R. Tyler Croy

----- End forwarded message -----

- R. Tyler Croy

------------------------------------------------------
     Code: <https://github.com/rtyler>
  Chatter: <https://twitter.com/agentdero>

  % gpg --keyserver keys.gnupg.net --recv-key 1426C7DC3F51E16F
------------------------------------------------------