Flux GitOps in GitLab

GitOps is a mode of operation where you keep Kubernetes manifests in a Git repository as the source of truth and use a tool to sync them to your cluster. There are several tools you can use, e.g. ArgoCD or Flux. GitLab uses Flux by default. I believe it is more lightweight than ArgoCD and requires fewer resources in your cluster. This makes it more suitable for small clusters like k3s or MicroK8s. You may want to use it if you do not need a fancy UI. If you have a big team of developers and they do not want to invest any effort in understanding Kubernetes logic, use ArgoCD. ...
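A minimal sketch of what Flux syncs from the repo: a Kustomization resource pointing at a path in the Git repository. Names, namespace and path below are illustrative assumptions:

```yaml
# Sketch of a Flux Kustomization; name, namespace and path are assumptions
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m            # how often Flux reconciles the repo state
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./clusters/my-cluster
  prune: true              # remove cluster objects deleted from the repo
```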

October 12, 2025

GitLab Agent CI/CD

GitLab offers 2 different ways to manage and provision resources in your Kubernetes clusters. You can go the GitOps way running FluxCD, or you can connect your cluster to GitLab using gitlab-agent and use kubectl commands directly in your .gitlab-ci.yml. If you decide to use gitlab-agent, it will install an additional pod into your cluster using Helm to keep 2-way communication between the cluster and GitLab. In your GitLab repo go to Operate -> Kubernetes clusters and create the new cluster. Save the agent ID. ...
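Once the agent is connected, a CI job can select the agent's kubecontext and run kubectl directly. A minimal .gitlab-ci.yml sketch; the project path, agent name and image are assumptions:

```yaml
deploy:
  image:
    name: bitnami/kubectl:latest
    entrypoint: ['']
  script:
    # context name is <path-of-project-holding-agent-config>:<agent-name>
    - kubectl config use-context my-group/my-project:my-agent
    - kubectl get pods -n default
```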

October 10, 2025

Automated Test for OpenTelemetry Deployment

The biggest chalenge in building Observability platform for big company is how to make strict policy for matrics labeling and automate configuration of Otel Collectors config files. This could be resolved by using some fleet management solution. Until today, I was not able to find full working open source solution, so you may need to create your own. However you resolve the above issue, it is good practice to have automated check after your mass deployment to see what servers picked up new config and sending signals labeled according to your latest configuration. ...
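One way to sketch such a check, assuming each collector stamps its metrics with a config-version label: diff your inventory against the hosts a Prometheus/Mimir series query reports. The file names, hostnames and the otel_config_version label are illustrative assumptions:

```shell
# expected.txt: inventory of hosts that should run the new config (sample data)
cat > expected.txt <<'EOF'
hostA
hostB
hostC
EOF

# reporting.txt: hosts seen with the new label, e.g. extracted from
#   curl -s "$MIMIR_URL/prometheus/api/v1/series" \
#     --data-urlencode 'match[]=up{otel_config_version="v42"}'
cat > reporting.txt <<'EOF'
hostA
hostC
EOF

sort expected.txt -o expected.txt
sort reporting.txt -o reporting.txt
# lines only in expected.txt = hosts that did not pick up the new config
comm -23 expected.txt reporting.txt > missing.txt
cat missing.txt
```

In a real run, reporting.txt would be produced from the query output (e.g. with jq) instead of a heredoc.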

September 29, 2025

OpenTelemetry Grafana APM Stack

Application Performance Monitoring is the ultimate level of observability in your systems. It comes on top of infra, network and other types of monitoring, providing info about the health and performance of your applications or services. OpenTelemetry is a CNCF project providing a standard way to collect telemetry data. It supports metrics, traces, and logs with vendor-neutral APIs and SDKs. It goes together with the Grafana-supported stack to store and visualize signals collected from monitored systems. One part of the stack is the OpenTelemetry SDK/API and the OpenTelemetry Collector to collect and transfer signals from applications, DBs, servers and other components of your system. The other part is the Grafana stack to store, search and visualize signals: Mimir (Prometheus) to store metrics, Loki to store logs and Tempo to store traces. ...
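The wiring between the two parts can be sketched in a Collector config. Endpoints and exporter names below are assumptions and depend on your Collector and Loki versions (e.g. Loki 3.x accepts logs on its native OTLP endpoint):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:
exporters:
  prometheusremotewrite:
    endpoint: http://mimir:9009/api/v1/push   # Mimir remote-write (assumed)
  otlphttp/loki:
    endpoint: http://loki:3100/otlp           # Loki native OTLP (assumed)
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheusremotewrite]
    logs:
      receivers: [otlp]
      exporters: [otlphttp/loki]
    traces:
      receivers: [otlp]
      exporters: [otlp/tempo]
```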

September 5, 2025

Provision Mimir Alert Using Curl

Alertmanager is part of Mimir. It will store rules and check them against the received metrics. When a rule is triggered it will send a notification to the defined notification channels. It provides an API so you can automate alert provisioning. You could keep alerts under source control and create them from CI/CD.

Alerts

Alerts are defined in YAML files. Here is a sample:

# alert.yaml
groups:
  - name: cpu_alerts
    interval: 30s
    rules:
      - alert: HighCPUUsage
        expr: system_cpu_time_seconds > 100
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"
          description: "CPU time exceeded threshold"

Curl Mimir API

curl -X POST http://<MIMIR_URL>/api/v1/rules/cpu_alerts \
  -H "Content-Type: application/yaml" \
  --data-binary @alert.yaml

August 30, 2025

OpenTelemetry SNMP Monitoring

SNMP is used to monitor many different devices like routers, UPSs, storage systems, etc. The ultimate SRE strategy is to unify all monitoring systems into one observability platform and get a single place where you see all the info needed in case of emergency. Here is how you can use OpenTelemetry to collect SNMP data from your devices and send it to Prometheus or Mimir. Then you can use Grafana to visualize it and create alerts. ...
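As a sketch, the contrib Collector's snmp receiver can poll an OID and expose it as a metric. The endpoint, community string and OID below are placeholders, and the exact metric schema depends on your Collector version:

```yaml
receivers:
  snmp:
    collection_interval: 60s
    endpoint: udp://192.168.1.1:161   # device address (assumed)
    version: v2c
    community: public
    metrics:
      ifInOctets:
        unit: "By"
        sum:
          aggregation: cumulative
          monotonic: true
          value_type: int
        scalar_oids:
          - oid: "1.3.6.1.2.1.2.2.1.10.1"   # IF-MIB::ifInOctets.1
exporters:
  prometheusremotewrite:
    endpoint: http://mimir:9009/api/v1/push   # assumed Mimir endpoint
service:
  pipelines:
    metrics:
      receivers: [snmp]
      exporters: [prometheusremotewrite]
```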

August 30, 2025

OpenTelemetry Database Monitoring

The OpenTelemetry Collector can help us get detailed database metrics by querying database system tables and views, parsing the results and creating metrics and logs for the observability platform, i.e. Prometheus or Mimir for metrics and Loki for logs. A basic requirement could be to see the top 10 queries consuming the most of your resources.

Strategy

OpenTelemetry Configuration Strategy

MySQL and MS SQL assign a unique label to each query (digest or query hash), so you can use this field (column) to correlate metrics to logs. This will improve your data ingestion on the observability platform side. Otherwise, you could end up with metrics with huge labels (full query text) and this can affect performance or indexing in Prometheus or Mimir. ...
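For MySQL, the digest-based approach could look like this sqlquery receiver sketch: the digest travels as a small metric attribute instead of the full query text. Credentials, host and the metric name are assumptions:

```yaml
receivers:
  sqlquery:
    driver: mysql
    datasource: "otel:secret@tcp(mysql-host:3306)/"   # assumed credentials
    collection_interval: 60s
    queries:
      - sql: >
          SELECT digest, schema_name, count_star, sum_timer_wait
          FROM performance_schema.events_statements_summary_by_digest
          ORDER BY sum_timer_wait DESC LIMIT 10
        metrics:
          - metric_name: db.query.timer_wait          # hypothetical name
            value_column: sum_timer_wait
            attribute_columns: [digest, schema_name]  # digest correlates to logs
```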

August 30, 2025

Automate Grafana Provisioning

In big deployments Grafana may have a big number of orgs, datasources and dashboards. If you need to automate provisioning of Grafana resources, you may try to create your own scripts and tools or try to find some open source solution to cover your use case. This comes down to 2 options: use the Grafana API or use file-based provisioning.

Typical Use Case

The typical use case could be: create an Azure AD group for users ...
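The file-based option means dropping YAML into Grafana's provisioning directory. A minimal datasource sketch; the uid, orgId and URL are assumptions:

```yaml
# /etc/grafana/provisioning/datasources/mimir.yaml
apiVersion: 1
datasources:
  - name: Mimir
    uid: mimir-ds            # fixed UID so dashboard links keep working
    type: prometheus
    access: proxy
    url: http://mimir:9009/prometheus
    orgId: 2                 # target org (assumed)
    isDefault: true
```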

August 29, 2025

Curl Grafana API

Useful curl commands to interact with the Grafana API. Create orgs, datasources, etc.

# Get all datasources
curl -X GET https://GRAFANA_URL/api/datasources \
  -H "Content-Type: application/json" -u 'user:pass' | jq

# Get all Organizations
curl -X GET https://GRAFANA_URL/api/orgs \
  -H "Content-Type: application/json" -u 'user:pass' | jq

# Create new Organization
curl -X POST -u 'USER:PASS' -H "Content-Type: application/json" \
  https://GRAFANA_URL/api/orgs -d '{"name": "New Organization"}'

August 9, 2025

Create Grafana Org With Perl

Intro

The Grafana API allows us to automate admin tasks. We have a big number of orgs mapped to Azure AD groups and we were looking for a way to automate provisioning. The process goes like this: create a new org, provision datasources (with the same UIDs), provision dashboards (with the same UIDs). In this way all dashboards work within the org, reusing all URLs, etc.

Simple Perl Script to Provision a Grafana ORG

#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;
use JSON::MaybeXS;
use MIME::Base64 qw(encode_base64);

my $user = $ENV{'USER'};
my $pass = $ENV{'PASS'};
my %url = (
    'dev'  => 'https://dev-grafana/api/orgs',
    'prod' => 'https://prod-grafana/api/orgs',
);

if (@ARGV != 2) {
    die "Usage: $0 <org-name> <dev|prod>\n";
}
my $org_name    = $ARGV[0];
my $environment = $ARGV[1];
die "Environment must be dev or prod\n" if $environment !~ /^(dev|prod)$/;

# Create ORG and get Org ID
my $data = { name => $org_name };
my $json = encode_json($data);
my $http = HTTP::Tiny->new;
my $credentials = encode_base64("$user:$pass", '');   # '' = no trailing newline

my $org_id;
my $headers = {
    "Content-Type"  => "application/json",
    "Authorization" => "Basic $credentials",
};
my $response = $http->request('POST', $url{$environment},
    { headers => $headers, content => $json });
if ($response->{success}) {
    my $resp_data = decode_json($response->{content});
    $org_id = $resp_data->{orgId};
} else {
    die "Failed to create org: $response->{status} $response->{reason}\n";
}
print "Created org ID: $org_id\n";

August 9, 2025