Introduction
Goal
The goals of this project are to:
- collect telemetry data (metrics, traces, logs) from the Jenkins Remoting module with OpenTelemetry.
- send the telemetry data to an OpenTelemetry Protocol (OTLP) endpoint.
Which OpenTelemetry endpoint to use and how to visualize the data are up to the user.
OpenTelemetry
An observability framework for cloud-native software
OpenTelemetry is a collection of tools, APIs, and SDKs. You can use it to instrument, generate, collect, and export telemetry data (metrics, logs, and traces) for analysis, in order to understand your software’s performance and behavior.
Quick Demo
Using Docker compose
Clone our repository, then run:
$ cd example
$ docker-compose up # it may take a few minutes
This will set up:
- Jenkins controller
  - preconfigured with JCasC
- Jenkins inbound agents
  - instrumented with our monitoring engine
- OpenTelemetry Collector
- Loki for log aggregation
- Prometheus for metric backend
- Grafana for log and metric visualization
  - datasource is already configured
Open Grafana: http://localhost:3000/explore
You can see the agents' logs in the Loki datasource and the agents' metrics in the Prometheus datasource.
Getting started with Inbound Agent
1. Install Remoting monitoring with OpenTelemetry Plugin
Install the Remoting monitoring with OpenTelemetry Plugin on your Jenkins controller.
Alternatively, you can set up a Jenkins controller with this plugin preinstalled using Docker Compose. See the next step for details.
Plugin page: https://plugins.jenkins.io/remoting-opentelemetry
2. Set up an OpenTelemetry Protocol endpoint and monitoring backends
We provide a docker-compose.yaml that sets them up. Use it if you just want to try things out.
Clone our repository, then run:
$ cd example
$ docker-compose up otel_collector loki prometheus grafana jenkins_blueocean
# or if you use your own Jenkins controller,
$ docker-compose up otel_collector loki prometheus grafana
This will set up:
- OpenTelemetry Collector
- Loki for log aggregation
- Prometheus for metric backend
- Grafana for log and metric visualization
  - datasource is already configured
- Jenkins Controller
  - Remoting monitoring with OpenTelemetry Plugin is preinstalled.
3. Download monitoring-engine
Download remoting-opentelemetry-engine.jar from the Jenkins Maven repository.
$ curl -g "https://repo.jenkins-ci.org/artifactory/releases/io/jenkins/plugins/remoting-opentelemetry-engine/[RELEASE]/remoting-opentelemetry-engine-[RELEASE].jar" -o remoting-opentelemetry-engine.jar
The -g (--globoff) option keeps curl from interpreting the square brackets in [RELEASE] as a URL glob range.
We will use this JAR as a Java agent when launching the agent.
4. Create a logging.properties file
Add io.jenkins.plugins.remotingopentelemetry.engine.log.OpenTelemetryLogHandler to the handlers:
handlers=io.jenkins.plugins.remotingopentelemetry.engine.log.OpenTelemetryLogHandler,java.util.logging.ConsoleHandler
.level=INFO
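As a minimal sketch, the file from this step can be written from the shell with a here-document. The file is created in the current directory, which is an assumption; put it wherever you plan to pass to -loggingConfig.

```shell
# Write logging.properties with the two handlers listed in this step.
# The quoted 'EOF' keeps the shell from expanding anything inside the body.
cat > logging.properties <<'EOF'
handlers=io.jenkins.plugins.remotingopentelemetry.engine.log.OpenTelemetryLogHandler,java.util.logging.ConsoleHandler
.level=INFO
EOF
```

Keeping java.util.logging.ConsoleHandler in the list means the agent still prints its logs to the console in addition to exporting them.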
5. Launch the Jenkins agent
Set up the Jenkins controller, then launch the agent with the -javaagent and -loggingConfig options.
$ export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:55680
$ java \
-javaagent:remoting-opentelemetry-engine.jar \
-jar agent.jar \
-jnlpUrl <jnlp url> \
-loggingConfig logging.properties
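Before launching, it can help to confirm that everything from steps 3 to 5 is in place. The function below is a hypothetical pre-flight check, not part of the plugin; it assumes the three files sit in the current directory under the names used in this guide.

```shell
# Hypothetical helper: verify the prerequisites for step 5.
# Prints what is missing to stderr and returns non-zero if anything is absent.
check_agent_prereqs() {
  local missing=0 f
  if [ -z "${OTEL_EXPORTER_OTLP_ENDPOINT:-}" ]; then
    echo "OTEL_EXPORTER_OTLP_ENDPOINT is not set" >&2
    missing=1
  fi
  for f in remoting-opentelemetry-engine.jar agent.jar logging.properties; do
    if [ ! -f "$f" ]; then
      echo "missing file: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (commented out so that sourcing this snippet does not start an agent):
# check_agent_prereqs && java \
#   -javaagent:remoting-opentelemetry-engine.jar \
#   -jar agent.jar \
#   -jnlpUrl <jnlp url> \
#   -loggingConfig logging.properties
```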
6. Explore logs and metrics
Open Grafana: http://localhost:3000/explore
Configuration options
The monitoring engine is configured via environment variables.
environment variable | required | example | description |
---|---|---|---|
OTEL_EXPORTER_OTLP_ENDPOINT | true | | Target to which the exporter is going to send spans, metrics or logs. |
SERVICE_INSTANCE_ID | false | 90caeb02-a5ba-4827-bb3e-63babecfa893 | The string ID of the service instance. If not provided, a new UUID is generated every time the agent launches, so the service instance ID changes on every agent restart. |
REMOTING_OTEL_METRIC_FILTER | false | "system\.cpu\..*" | Regex filter for metrics. Only metrics whose names match the regex are collected. The default value is ".*", which collects all metrics. |
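The sketch below shows one way to set the two optional variables. Both helper functions and the .service_instance_id file name are hypothetical. The preview uses grep -Ex, which assumes the whole metric name must match; the engine presumably evaluates the filter as a Java regex, so grep is only an approximation for simple patterns like the example above.

```shell
# Hypothetical helper: reuse one service instance ID across agent restarts
# by persisting it in a file (default file name is an assumption).
stable_instance_id() {
  local id_file="${1:-.service_instance_id}"
  if [ ! -s "$id_file" ]; then
    if command -v uuidgen >/dev/null 2>&1; then
      uuidgen > "$id_file"
    else
      # Linux fallback when uuidgen is not installed.
      cat /proc/sys/kernel/random/uuid > "$id_file"
    fi
  fi
  cat "$id_file"
}

# Hypothetical helper: print which of the given metric names a filter keeps.
# Usage: preview_metric_filter REGEX NAME...
preview_metric_filter() {
  local regex="$1"
  shift
  printf '%s\n' "$@" | grep -Ex -- "$regex"
}

export SERVICE_INSTANCE_ID="$(stable_instance_id)"
export REMOTING_OTEL_METRIC_FILTER='system\.cpu\..*'

# Prints system.cpu.load and system.cpu.load.average.1m, one per line.
preview_metric_filter "$REMOTING_OTEL_METRIC_FILTER" \
  system.cpu.load \
  system.cpu.load.average.1m \
  system.memory.usage \
  process.cpu.load
```

Single quotes around the regex keep the shell from touching the backslashes before the value reaches the agent.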
Specification
Resource
The following resource attributes will be provided.
key | value | description |
---|---|---|
service_namespace | "jenkins" | This value will be configurable in the future. |
service_name | "jenkins-agent" | This value will be configurable in the future. |
service_instance_id | Node name | |
Logs
Only logs emitted via java.util.logging will be collected for now.
The following attributes will be provided.
key | example | description |
---|---|---|
log.level | INFO | Log level name. |
code.namespace | hudson.remoting.jnlp.Main$CuiListener | The name of the class that (allegedly) issued the logging request. |
code.function | status | The name of the method that (allegedly) issued the logging request. |
exception.type | java.io.IOException | The class name of the throwable associated with the log record. |
exception.message | Broken pipe | The detail message string of the throwable associated with the log record. |
exception.stacktrace | java.io.IOException: Broken pipe at hudson.remoting.Engine.innerRun(Engine.java:784) at hudson.remoting.Engine.run(Engine.java:575) | The stack trace of the throwable associated with the log record. |
Metrics
The following metrics will be collected.
metrics | unit | label key | label value | description |
---|---|---|---|---|
jenkins.agent.connection.establishments.count | 1 | | | The count of connection establishments. The value will be reset when the agent restarts. |
system.cpu.load | 1 | | | System CPU load. |
system.cpu.load.average.1m | | | | System CPU load average 1 minute. |
system.memory.usage | byte | state | | |
system.memory.utilization | 1 | | | System memory utilization. |
system.paging.usage | byte | state | | |
system.paging.utilization | 1 | | | |
system.filesystem.usage | byte | device | (identifier) | System level filesystem usage. Linux only (get mount data from /proc/mounts). |
 | | state | | |
 | | type | | |
 | | mode | | |
 | | mountpoint | (path) | |
system.filesystem.utilization | 1 | device | (identifier) | System level filesystem utilization (0.0 to 1.0). Linux only (get mount data from /proc/mounts). |
 | | state | | |
 | | type | | |
 | | mode | | |
 | | mountpoint | (path) | |
process.cpu.load | % | | | Process CPU load. |
process.cpu.time | ns | | | Process CPU time. |
runtime.jvm.memory.area | bytes | type | | see MemoryUsage |
 | | area | | |
runtime.jvm.memory.pool | bytes | type | | see MemoryUsage |
 | | pool | | |
runtime.jvm.gc.time | ms | gc | | |
runtime.jvm.gc.count | 1 | gc | | |
Contributing
Refer to our contribution guidelines.
LICENSE
Licensed under MIT, see LICENSE