[Jenkins-infra] Backend considerations for the Telemetry API

Olblak me at olblak.com
Fri Aug 31 08:48:42 UTC 2018


Hey, 

Indeed it sounds like a simple application; we just need to agree on how we store the state, file vs. database.
Personally it doesn't matter to me, it's just a different Terraform resource.
We just have to know how we'll display that data (ELK?).

  "HTTP endpoint" -> "Application logic in Container" -> "State"

Regarding Azure Functions, I am not comfortable starting to allow developers access to the Azure account today. We have a lot of critical information and services there at the moment, and we never took the time to really define access policies; if we do, we'll have to review them regularly.
And I don't have the time at the moment to do that, nor to act as the proxy for the Azure account, especially when I see how often Tyler had to deal with Azure Functions support.

If we want to use a serverless solution, I would prefer deploying something on top of a Kubernetes cluster with limited access to some namespaces.
The main advantage that I see with this approach is that I can easily deploy a framework with Helm and then let developers be autonomous with the applications they are maintaining.
Fission sounds interesting for that, but I don't have much experience with serverless frameworks.
https://fission.io/

Or we could 'just' deploy another application the way we currently do with Kubernetes: build a self-contained Docker container and configure the needed services (PostgreSQL, ...).
The advantages of this approach are:
  * Well-known workflow
  * Easy to replicate, with docker-compose for the local environment
  * We can also provide read-only access to namespaces
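For the local environment, the docker-compose setup could look something like this (a sketch only; service names, image tags, and credentials are placeholders, not an existing configuration):

```yaml
# Hypothetical local environment for the Telemetry API.
version: "3"
services:
  telemetry-api:
    build: .
    ports:
      - "8080:8080"
    environment:
      DATABASE_URL: postgres://telemetry:secret@db:5432/telemetry
    depends_on:
      - db
  db:
    image: postgres:10
    environment:
      POSTGRES_USER: telemetry
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: telemetry
```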

> I believe I _could_ implement GitHub-based OAuth authorization for this in an afternoon
I don't understand why you need that.
Isn't there an existing solution that we could reuse?

Personally I am more interested in seeing serverless frameworks in action :) , but I think it's better to let the person in charge of maintaining that application take the final decision.

To make it short:
Azure Functions
  -> It requires Azure access, and I have the feeling that it doesn't scale.
Serverless framework
  -> It first needs to be deployed, and I probably won't have the time to do that before JW.
  -> It sounds like the most fun scenario to me.
"Classic" application
  -> We already have a lot of examples and I can point to the best existing one; once you have a container, it's easy for me to deploy it.
  -> It sounds like the easiest scenario to me.


---
-> gpg --keyserver keys.gnupg.net --recv-key 52210D3D
---

On Fri, Aug 31, 2018, at 1:16 AM, R. Tyler Croy wrote:
> Reading through Daniel's proposal for a Telemetry API
> (https://github.com/jenkinsci/jep/tree/master/jep/214),
> I wanted to share some thoughts on how we might support or implement this in
> the current state of Jenkins infrastructure.
> 
> Based on the description laid forth by Daniel, what is needed is a relatively
> low-effort append-only data store, with a simple HTTP endpoint in front of it.
> 
> An option originally suggested by Daniel would be to utilize Azure Functions.
> Currently, I'm not the most thrilled with the performance and maintainability
> of Azure Functions, but if we were to go that route, using the Table Storage
> output binding (see
> https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-table#input---javascript-example)
> to append rows to an Azure Table Storage
> (https://docs.microsoft.com/en-us/azure/storage/tables/table-storage-overview)
> instance does _appear_ to be relatively straightforward.
> 
>     Pros:
>      * Allegedly quite simple to implement
>      * Allegedly scalable
>     Cons:
>      * When things fail, it is incredibly difficult to notice, and hell to
>        debug.
>      * Access control for developers to specific parts of Azure Table Storage
>        would need to be added into Azure Active Directory. Otherwise we would
>        need a dedicated person with access to export data for individuals.
> 
> 
> Perhaps my preferred option, which is more work, but a little more maintainable
> would be to implement a small web application deployed in our Kubernetes
> environment, which sits in front of a PostgreSQL database. That database could
> be very simple, treating PostgreSQL as a simple key-value store with columns
> for: telemetry_type, json_data, timestamp.  I think it could be generally
> useful in the future to have this "utility" PostgreSQL database shared across a
> number of different services which need some little bits of storage, like this
> Telemetry API. For export, this approach may suffer from the same problem as the
> Azure Functions-based approach however. I believe I _could_ implement
> GitHub-based OAuth authorization for this in an afternoon if I were to be using
> the same type of stack which Jenkins Evergreen uses on the backend however.
> 
>     Pros:
>      * More control over the runtime environment
>      * As scalable as our Kubernetes environment
>     Cons:
>      * More costly, an Azure PostgreSQL database would need to be provisioned
>      * Data export is still a challenge and must be implemented
> 
> 
> 
> I wanted to write these thoughts up to start the discussion with Olivier to see
> what he thinks as well, as perhaps there is a different approach he might have
> in mind.
> 
> I'm partial to the latter approach, and in exchange for some beers or Daniel
> taking some tasks off my plate, I might be able to whip something up in a day
> if I were uninterrupted ;)
> 
> 
> Whatcha think?
> 
> 
> _______________________________________________
> Jenkins-infra mailing list
> Jenkins-infra at lists.jenkins-ci.org
> http://lists.jenkins-ci.org/mailman/listinfo/jenkins-infra

