OpenTelemetry Grafana APM Stack

Application Performance Monitoring (APM) is the top layer of observability in your systems. It sits on top of infrastructure, network and other types of monitoring and provides information about the health and performance of your applications and services. OpenTelemetry is a CNCF project that provides a standard way to collect telemetry data. It supports metrics, traces, and logs with vendor-neutral APIs and SDKs. It pairs with the Grafana stack, which stores and visualizes the signals collected from monitored systems. The collection side consists of the OpenTelemetry SDK/API and the OpenTelemetry Collector, which collect and transfer signals from applications, databases, servers and other components of your system. The other part is the Grafana stack, which stores, searches and visualizes signals: Mimir (Prometheus) for metrics, Loki for logs and Tempo for traces. ...

September 5, 2025

Provision Mimir Alert Using Curl

Mimir includes a ruler and an Alertmanager. The ruler stores rules and evaluates them against the received metrics. When a rule fires, Alertmanager sends a notification to the configured notification channels. Mimir provides an API, so you can automate alert provisioning. You could keep alerts under source control and create them from CI/CD.

Alert

Alerts are defined in YAML files. Here is a sample:

    # alert.yaml
    groups:
      - name: cpu_alerts
        interval: 30s
        rules:
          - alert: HighCPUUsage
            expr: system_cpu_time_seconds > 100
            for: 1m
            labels:
              severity: warning
            annotations:
              summary: "High CPU usage detected"
              description: "CPU time exceeded threshold"

Curl Mimir API

    curl -X POST http://<MIMIR_URL>/api/v1/rules/cpu_alerts \
      -H "Content-Type: application/yaml" \
      --data-binary @alert.yaml
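For a CI/CD job, the same upload can be sketched in Python with only the standard library. This is a minimal sketch, not a definitive client: the function names are hypothetical, the URL path is copied from the curl command above and may need adjusting for your Mimir version and HTTP prefix, and the optional `X-Scope-OrgID` tenant header is an assumption that only applies to multi-tenant deployments.

```python
import urllib.request
from typing import Optional


def build_rule_request(mimir_url: str, namespace: str, rule_yaml: bytes,
                       tenant: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the POST request that uploads a rule file."""
    # Path mirrors the curl example; adjust it to match your Mimir setup.
    url = f"{mimir_url.rstrip('/')}/api/v1/rules/{namespace}"
    headers = {"Content-Type": "application/yaml"}
    if tenant is not None:
        # Assumption: multi-tenancy is enabled and the tenant is sent as a header.
        headers["X-Scope-OrgID"] = tenant
    return urllib.request.Request(url, data=rule_yaml, headers=headers, method="POST")


def upload_rules(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code (raises on HTTP errors)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Splitting the request building from the sending keeps the first part easy to check in a pipeline without a running Mimir instance. A job would read `alert.yaml` as bytes, pass it to `build_rule_request`, and then call `upload_rules`.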

August 30, 2025

OpenTelemetry SNMP Monitoring

SNMP is used to monitor many different devices such as routers, UPSs and storage systems. The ultimate SRE strategy is to unify all monitoring systems into one observability platform, so there is a single place where you see all the information needed in an emergency. Here is how you can use OpenTelemetry to collect SNMP data from your devices and send it to Prometheus or Mimir. You can then use Grafana to visualize the data and create alerts. ...

August 30, 2025

OpenTelemetry Database Monitoring

The OpenTelemetry Collector can collect detailed database metrics by querying database system tables and views, parsing the results and creating metrics and logs for the observability platform, i.e. Prometheus or Mimir for metrics and Loki for logs. A basic requirement could be to see the top 10 queries consuming the most resources.

Strategy

MySQL and MS SQL assign a unique identifier to each query (a digest or query hash), so you can use this column to correlate metrics with logs. This keeps ingestion on the observability platform side efficient. Otherwise, you could end up with metrics carrying huge labels (the full query text), which can hurt performance and indexing in Prometheus or Mimir. ...
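The digest-based correlation can be illustrated with a small Python sketch. It is only an illustration of the idea, not the collector configuration: the `QueryStat` type, the metric name `db_query_time_seconds` and both helper functions are hypothetical. The metric carries just the short digest as a label, while the full query text goes into a log record keyed by the same digest.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class QueryStat:
    digest: str          # digest or query hash reported by the database
    query_text: str      # full SQL text, potentially very long
    total_time_s: float  # total time spent in this query


def split_signals(rows: List[QueryStat]) -> Tuple[List[Dict], List[Dict]]:
    """Turn query statistics into low-cardinality metrics plus correlated logs."""
    metrics: List[Dict] = []
    logs: List[Dict] = []
    for row in rows:
        # The metric is labelled with the digest only, never the query text.
        metrics.append({
            "name": "db_query_time_seconds",  # hypothetical metric name
            "labels": {"digest": row.digest},
            "value": row.total_time_s,
        })
        # The log record holds the full text and the same digest for lookup.
        logs.append({"digest": row.digest, "body": row.query_text})
    return metrics, logs


def top_queries(rows: List[QueryStat], limit: int = 10) -> List[QueryStat]:
    """Return the queries with the highest total time, most expensive first."""
    return sorted(rows, key=lambda r: r.total_time_s, reverse=True)[:limit]
```

A dashboard can then show the top digests from the metrics and link each one to the matching log record to reveal the query text.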

August 30, 2025

Automate Grafana Provisioning

In large deployments Grafana may have a large number of Orgs, datasources and dashboards. If you need to automate provisioning of Grafana resources, you can write your own scripts and tools or look for an open source solution that covers your use case. This comes down to two options: use the Grafana API or use file-based provisioning.

Typical Use Case

The typical use could be: create Azure AD group for users ...

August 29, 2025

Curl Grafana API

Useful curl commands to interact with the Grafana API. Create Orgs, datasources, etc.

    # Get all datasources
    curl -X GET https://GRAFANA_URL/api/datasources -H "Content-Type: application/json" -u 'user:pass' | jq

    # Get all Organizations
    curl -X GET https://GRAFANA_URL/api/orgs -H "Content-Type: application/json" -u 'user:pass' | jq

    # Create new Organization
    curl -X POST -u 'USER:PASS' -H "Content-Type: application/json" https://GRAFANA_URL/api/orgs -d '{"name": "New Organization"}'
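When these calls move from the terminal into a script, the same requests can be built in Python. The following is a minimal sketch using only the standard library; `grafana_request` is a hypothetical helper name, and it assumes basic authentication as in the curl commands (what `-u 'user:pass'` does under the hood). It builds the request without sending it.

```python
import base64
import json
import urllib.request
from typing import Any, Dict, Optional


def grafana_request(base_url: str, path: str, user: str, password: str,
                    payload: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Build a Grafana API request: GET without a payload, POST with one."""
    # Basic auth is "user:password" encoded as base64.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Basic {token}",
    }
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    method = "GET" if payload is None else "POST"
    url = f"{base_url.rstrip('/')}{path}"
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

Passing the result to `urllib.request.urlopen` sends it, for example `grafana_request(url, "/api/orgs", user, password, {"name": "New Organization"})` for the last curl command.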

August 9, 2025

Create Grafana Org With Perl

Intro

The Grafana API allows us to automate admin tasks. We have a large number of Orgs mapped to Azure AD groups and we were looking for a way to automate provisioning. The process goes like this: create a new Org, provision datasources (with the same UIDs), provision dashboards (with the same UIDs). This way all dashboards work within the Org, reusing all URLs, etc.

Simple Perl Script to Provision Grafana ORG

    #!/usr/bin/perl
    use strict;
    use warnings;
    use HTTP::Tiny;
    use JSON::MaybeXS;
    use MIME::Base64 qw(encode_base64);

    # Credentials come from the environment.
    # Note: most shells already set USER to the login name, so export both explicitly.
    my $user = $ENV{'USER'};
    my $pass = $ENV{'PASS'};

    my %url = (
        'dev'  => 'https://dev-grafana/api/orgs',
        'prod' => 'https://prod-grafana/api/orgs',
    );

    if (@ARGV != 2) {
        die "Usage: $0 <org-name> <dev|prod>\n";
    }

    my $org_name    = $ARGV[0];
    my $environment = $ARGV[1];
    die "Environment must be dev or prod\n" if $environment !~ /^(dev|prod)$/;

    # Create ORG and Get Org ID
    my $data = { name => $org_name };
    my $json = encode_json($data);
    my $http = HTTP::Tiny->new;
    my $credentials = encode_base64("$user:$pass", '');   # no newline
    my $org_id;

    my $headers = {
        "Content-Type"  => "application/json",
        "Authorization" => "Basic $credentials",
    };

    my $response = $http->request(
        'POST',
        $url{$environment},
        { headers => $headers, content => $json },
    );

    if ($response->{success}) {
        my $resp_data = decode_json($response->{content});
        $org_id = $resp_data->{orgId};
    } else {
        die "Failed to create org: $response->{status} $response->{reason}\n";
    }

    print "Created org ID: $org_id\n";
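The full process described in the intro (Org first, then datasources, then dashboards, all with fixed UIDs) can be sketched as an ordered list of API calls. This Python sketch is an illustration under assumptions, not the author's tooling: `provisioning_plan` is a hypothetical helper, the datasource and dashboard payloads are placeholders, and the plan leaves out how the calls after the first one are scoped to the new Org (for example by switching the active Org or sending an Org ID header).

```python
from typing import Any, Dict, List


def provisioning_plan(org_name: str,
                      datasources: List[Dict[str, Any]],
                      dashboards: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the API calls, in order, needed to provision one Org."""
    # Every datasource and dashboard must carry a fixed UID so that
    # dashboards and URLs resolve identically in every Org.
    for item in datasources + dashboards:
        if not item.get("uid"):
            raise ValueError(f"missing uid in {item!r}")

    steps: List[Dict[str, Any]] = [
        {"method": "POST", "path": "/api/orgs", "body": {"name": org_name}},
    ]
    for datasource in datasources:
        steps.append({"method": "POST", "path": "/api/datasources", "body": datasource})
    for dashboard in dashboards:
        steps.append({
            "method": "POST",
            "path": "/api/dashboards/db",
            "body": {"dashboard": dashboard, "overwrite": True},
        })
    return steps
```

Validating the UIDs before any call is made avoids ending up with a half-provisioned Org when one definition is incomplete.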

August 9, 2025

What Is O11y

O11y, shorthand for Observability - O + 11 letters + y :-) Observability is the ability to understand the internal state of a system by examining only the data it outputs.

Three Pillars

Metrics

Quantitative measurements collected over time. Useful for dashboards, alerts, and trend analysis.

Traces

Traces show the journey of a request through a system. Comprised of spans, which represent operations. Useful for understanding latency, bottlenecks, and dependencies. ...
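The relationship between a trace and its spans can be made concrete with a small Python sketch. It is a simplified model for illustration only: the `Span` type and the `bottleneck` helper are hypothetical and do not reflect the OpenTelemetry SDK data model, and span names are assumed to be unique within the trace.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Span:
    name: str                     # operation name, assumed unique in this sketch
    start_ms: int                 # start time in milliseconds
    end_ms: int                   # end time in milliseconds
    parent: Optional[str] = None  # name of the parent span, None for the root

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def bottleneck(trace: List[Span], parent_name: str) -> Optional[Span]:
    """Return the slowest direct child of the given span, if it has any."""
    children = [span for span in trace if span.parent == parent_name]
    if not children:
        return None
    return max(children, key=lambda span: span.duration_ms)


# One request: the root span covers the whole journey, children cover operations.
trace = [
    Span("GET /orders", 0, 120),
    Span("auth", 5, 15, parent="GET /orders"),
    Span("db query", 20, 110, parent="GET /orders"),
]
```

Here `bottleneck(trace, "GET /orders")` points at the database query, which is the kind of latency question traces are meant to answer.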

August 2, 2025