[Jenkins-infra] Backend considerations for the Telemetry API

R. Tyler Croy tyler at monkeypox.org
Thu Aug 30 23:16:46 UTC 2018


Reading through Daniel's proposal for a Telemetry API
(https://github.com/jenkinsci/jep/tree/master/jep/214),
I wanted to share some thoughts on how we might support or implement this in
the current state of Jenkins infrastructure.

Based on the description laid forth by Daniel, what is needed is a relatively
low-effort append-only data store, with a simple HTTP endpoint in front of it.

An option originally suggested by Daniel would be to use Azure Functions.
Currently, I'm not thrilled with the performance and maintainability of Azure
Functions, but if we were to go that route, using the Table Storage output
binding (see https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-table#input---javascript-example)
to append rows to an Azure Table Storage
(https://docs.microsoft.com/en-us/azure/storage/tables/table-storage-overview)
instance does _appear_ to be relatively straightforward.

    Pros:
     * Allegedly quite simple to implement
     * Allegedly scalable
    Cons:
     * When things fail, it is incredibly difficult to notice, and hell to
       debug.
     * Access control for developers to specific parts of Azure Table Storage
       would need to be added into Azure Active Directory. Otherwise we would
       need a dedicated person with access to export data for individuals.


Perhaps my preferred option, which is more work but a little more maintainable,
would be to implement a small web application deployed in our Kubernetes
environment, sitting in front of a PostgreSQL database. That database could
be very simple, treating PostgreSQL as a plain key-value store with columns
for: telemetry_type, json_data, timestamp.  I think it could be generally
useful in the future to have this "utility" PostgreSQL database shared across a
number of different services which need little bits of storage, like this
Telemetry API. For export, however, this approach may suffer from the same
problem as the Azure Functions-based approach. I believe I _could_ implement
GitHub-based OAuth authorization for this in an afternoon if I were to use
the same type of stack which Jenkins Evergreen uses on the backend.

    Pros:
     * More control over the runtime environment
     * As scalable as our Kubernetes environment
    Cons:
     * More costly, an Azure PostgreSQL database would need to be provisioned
     * Data export is still a challenge and must be implemented
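To show how simple the storage side could be, here is a minimal sketch of the
append-only store, with Python's sqlite3 standing in for the proposed Azure
PostgreSQL instance. The three columns match the layout described above; the
table and function names are mine, purely illustrative:

```python
import json
import sqlite3
from datetime import datetime, timezone

# sqlite3 stands in here for the proposed "utility" PostgreSQL database;
# the columns match the telemetry_type/json_data/timestamp layout above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS telemetry (
    telemetry_type TEXT NOT NULL,
    json_data      TEXT NOT NULL,
    timestamp      TEXT NOT NULL
)
"""

def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(SCHEMA)
    return db

def append_event(db, telemetry_type, payload):
    # Append-only: the HTTP endpoint would only ever INSERT,
    # never UPDATE or DELETE.
    db.execute(
        "INSERT INTO telemetry (telemetry_type, json_data, timestamp) "
        "VALUES (?, ?, ?)",
        (telemetry_type, json.dumps(payload),
         datetime.now(timezone.utc).isoformat()),
    )
    db.commit()

def export_events(db, telemetry_type):
    # The per-developer export mentioned above would wrap a query
    # like this, behind whatever authorization we settle on.
    rows = db.execute(
        "SELECT json_data FROM telemetry WHERE telemetry_type = ? "
        "ORDER BY timestamp",
        (telemetry_type,),
    )
    return [json.loads(r[0]) for r in rows]
```

The web application in front of this would be little more than a POST handler
calling append_event and an authenticated GET handler calling export_events.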



I wanted to write these thoughts up to start the discussion with Olivier and
see what he thinks as well, as perhaps he has a different approach in mind.

I'm partial to the latter approach, and in exchange for some beers or Daniel
taking some tasks off my plate, I might be able to whip something up in a day
if I were uninterrupted ;)


Whatcha think?


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
URL: <http://lists.jenkins-ci.org/pipermail/jenkins-infra/attachments/20180830/a3342541/attachment.asc>
