Module 3: Distributed tracing and OpenTelemetry

With metrics from Module 1 and centralized logging from Module 2, you can detect issues and understand what went wrong. However, a challenge remains: when your application experiences slow response times, metrics show overall slowness and logs show individual service behavior, but neither reveals where in the call chain the delay originates.

Your organization’s application—the frontend, backend, database, and notifier services—processes every user request as a chain of HTTP calls across multiple services. When performance degrades, you need to see the complete path of each request, including the time spent in each hop.

In this module, you will learn distributed tracing concepts, verify the Tempo tracing backend, understand how the Go application is instrumented with the OpenTelemetry SDK, and then activate the full telemetry pipeline end-to-end. By the end, you will observe live traces, metrics, and logs flowing from all four services through the OpenTelemetry Collector to Tempo, Prometheus, and Loki—and correlate them across all three signals in real time.

Learning objectives

By the end of this module, you’ll be able to:

  • Understand distributed tracing concepts (spans, traces, context propagation)

  • Understand how Tempo stores and serves trace data

  • Understand the two-tier sidecar-to-central OpenTelemetry Collector architecture

  • Create a sidecar OpenTelemetryCollector CR and configure its pipeline

  • Create an Instrumentation CR for zero-code agent injection

  • Enable OpenTelemetry on the workshop application

  • Observe live traces in the Observe → Traces console UI and query them with TraceQL

  • Understand how the central collector fans out telemetry to Tempo, Prometheus, and Loki

  • Enable zero-code Python auto-instrumentation on the notifier service

  • Observe a four-hop trace spanning Go and Python services in a single waterfall

Understanding distributed tracing

Before enabling tracing, you need to understand how distributed tracing works in microservice architectures.

Core tracing concepts

Trace: The complete journey of a single request through your system.

  • Example: A user submits a note → frontend service → backend service → database service

  • A trace captures the entire chain, start to finish, with timing and status at each step.

Span: One unit of work within a trace.

  • Example: backend: POST /api/notes — one hop in the note-creation trace

  • Contains: operation name, start time, duration, HTTP status code, and key-value attributes

Context propagation: The mechanism that ties spans together across service boundaries.

  • Each service receives an incoming trace context via the traceparent HTTP header

  • It creates a child span linked to the caller’s span, then propagates the context further downstream

  • Uses the W3C Trace Context standard (traceparent, tracestate headers)

Parent-child relationships: Spans form a tree.

  • Root span: frontend: GET /new-note

  • Child span: backend: POST /api/notes (called by frontend)

  • Grandchild span: database: POST /api/events (called by backend)
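The traceparent header that ties these spans together has a fixed four-field format. A minimal Python sketch of how it is built and forwarded at each hop (illustrative only; in the workshop application the OpenTelemetry SDK does all of this automatically):

```python
import secrets

def make_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    # W3C Trace Context format: version-traceid-spanid-flags
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"

def propagate(incoming: str) -> str:
    # What each hop does: keep the trace ID, mint a new span ID for the
    # child span, and forward the updated header downstream
    version, trace_id, parent_span_id, flags = incoming.split("-")
    child_span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex chars
    return f"{version}-{trace_id}-{child_span_id}-{flags}"

root = make_traceparent(secrets.token_hex(16), secrets.token_hex(8))  # frontend starts the trace
downstream = propagate(root)  # backend receives it and creates a child span
assert downstream.split("-")[1] == root.split("-")[1]  # same trace ID across every hop
```

No service in this workshop manipulates traceparent by hand; instrumentation libraries inject and extract it on every HTTP call.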

Why distributed tracing matters

Without tracing: You see symptoms, not causes.

  • Metrics show: 95th-percentile API response time increased from 80ms to 950ms

  • Logs show: all three services logged warnings at the same time

  • Question: Which service is actually slow?

With tracing: You see the complete picture.

  • Trace shows: frontend→backend took 20ms; backend→database took 880ms (bottleneck found)

  • Other services operated normally

  • Answer: The database service query needs optimization

Tempo architecture

Tempo is a distributed tracing backend optimized for storing and querying traces on cost-effective object storage.

Key components:

  • Distributor: Receives trace data from instrumented applications over OTLP (gRPC port 4317, HTTP port 4318)

  • Ingester: Buffers spans in memory and writes them to object storage

  • Querier: Serves trace queries by reading from both the ingester cache and object storage in parallel

  • Query Frontend: Load-balances query traffic across Querier pods

  • Compactor: Merges and optimizes stored trace blocks over time

Storage: Tempo requires S3-compatible object storage. In this workshop the TempoStack uses in-cluster object storage provisioned by the OpenShift Data Foundation (ODF) NooBaa Multi-Cloud Gateway.

Integration: Tempo is queried by the Distributed Tracing UI plugin built into the OpenShift console via the Cluster Observability Operator. There is no separate Jaeger pod—the query interface is embedded directly in the console.

Understanding the OpenTelemetry collector architecture

Before enabling telemetry, understand how data flows from the application to each observability backend.

Two-tier collector pattern

This workshop uses a two-tier OpenTelemetry Collector topology. Every signal—traces, metrics, and logs—flows through the same two hops before reaching a purpose-built backend:

Application pods (%OPENSHIFT_USERNAME%-observability-demo)
  +--------------------------------------------------------+
  |  frontend / backend / database / notifier              |
  |  +----------+  OTLP HTTP (localhost:4318)              |
  |  | app      |---> otc-container (sidecar collector)   |
  |  +----------+                                          |
  +--------------------------------------------------------+
         | All signals: traces + metrics + logs
         | OTLP gRPC (cluster DNS, port 4317)
         v
  central-collector-collector.observability-demo.svc:4317
  +------------------------------------------------------------------+
  |  Central collector – deployment mode, 2 replicas                 |
  |  (namespace: observability-demo)                                 |
  |                                                                  |
  |  Signal routing:                                                 |
  |  traces  --> otlp/tempo     --> TempoStack distributor :4317     |
  |  traces  --> spanmetrics    --> metrics/spanmetrics pipeline      |
  |  metrics --> prometheusremotewrite                               |
  |              --> COO Prometheus /api/v1/write :9090              |
  |  logs    --> otlphttp/logs                                       |
  |              --> LokiStack gateway :8080 (application tenant)    |
  +------------------------------------------------------------------+

Why two tiers?

  • Sidecar collector: Runs within the application pod as a second container (otc-container). The app sends to localhost:4318 (no network hop). The sidecar enriches every span, metric data point, and log record with Kubernetes metadata (k8s.pod.name, k8s.deployment.name, k8s.namespace.name) via the k8sattributes processor, then forwards all three signals over a single gRPC connection to the central collector.

  • Central collector: Runs as a shared Deployment (2 replicas) in observability-demo. It receives all signals from every user’s sidecar and routes them to different backends using different protocols and auth mechanisms. It also runs the spanmetrics connector, which generates RED metrics directly from incoming trace spans.

This pattern keeps the sidecar simple (no secrets, no TLS config, no auth tokens) while centralizing the complex backend integrations in one place.

Processors in the sidecar collector

The sidecar uses four processors in order:

  • memory_limiter: Prevents the collector from consuming more than 75% of available pod memory

  • resourcedetection: Detects OpenShift infrastructure attributes (such as k8s.cluster.name) and adds them to all telemetry

  • k8sattributes: Calls the Kubernetes API to attach pod name, namespace, deployment name, node name, and pod UID to every span, metric, and log record

  • batch: Accumulates records before sending to reduce network overhead

Span metrics connector in the central collector

The central collector uses a spanmetrics connector to generate RED metrics automatically from incoming traces:

  • Rate: traces_spanmetrics_calls_total — request count per service and operation

  • Errors: traces_spanmetrics_calls_total{status.code="STATUS_CODE_ERROR"} — error count

  • Duration: traces_spanmetrics_latency_bucket — latency histogram with configurable buckets

These metrics are published to the COO-managed Prometheus instance via prometheusremotewrite. The COO MonitoringStack has enableRemoteWriteReceiver: true set, which activates the /api/v1/write endpoint that Prometheus exposes for ingest.
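You can confirm these span-derived metrics later from the console's metrics view with PromQL along these lines (a sketch; the label names service_name and span_name are typical spanmetrics connector defaults after remote write and may differ by collector version):

```promql
# Rate: requests per second, per service
sum by (service_name) (rate(traces_spanmetrics_calls_total[5m]))

# Duration: 95th-percentile latency per operation
histogram_quantile(0.95,
  sum by (le, span_name) (rate(traces_spanmetrics_latency_bucket[5m])))
```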

Exercise 1: Verify Tempo deployment

The Tempo distributed tracing stack was deployed via GitOps as part of the workshop infrastructure. Verify that all components are running and ready.

Steps

  1. Verify the Tempo Operator is running:

    oc get pods -n openshift-tempo-operator
    Expected output
    NAME                                          READY   STATUS    RESTARTS   AGE
    tempo-operator-controller-manager-xxxxx       2/2     Running   0          1h
  2. Check the TempoStack instance:

    oc get tempostack -n openshift-tempo-operator
    Expected output
    NAME    AGE   CONDITION
    tempo   1h    Ready

    The CONDITION must be Ready before traces can be ingested.

  3. Verify each Tempo component is running:

    oc get pods -n openshift-tempo-operator -l app.kubernetes.io/instance=tempo
    Expected output
    NAME                                        READY   STATUS    RESTARTS   AGE
    tempo-tempo-compactor-xxxxx                 1/1     Running   0          1h
    tempo-tempo-distributor-xxxxx               1/1     Running   0          1h
    tempo-tempo-ingester-0                      1/1     Running   0          1h
    tempo-tempo-querier-xxxxx                   1/1     Running   0          1h
    tempo-tempo-query-frontend-xxxxx            1/1     Running   0          1h
  4. Confirm the distributor service endpoint (used by the OpenTelemetry Collector to write traces):

    oc get svc tempo-tempo-distributor -n openshift-tempo-operator
    Expected output
    NAME                      TYPE        CLUSTER-IP      PORT(S)
    tempo-tempo-distributor   ClusterIP   172.30.x.x      4317/TCP, 4318/TCP

    Port 4317 is OTLP gRPC and port 4318 is OTLP HTTP. The central OpenTelemetry Collector forwards traces to this endpoint.

Verify

Check that your tracing infrastructure is operational:

  • ✓ Tempo Operator pod is Running (2/2 containers)

  • ✓ TempoStack instance condition is Ready

  • ✓ All Tempo component pods (distributor, ingester, querier, query-frontend, compactor) are Running

  • ✓ Tempo distributor service exposes ports 4317 and 4318

What you learned: The TempoStack operator deploys and manages all Tempo components. The distributor is the write endpoint; the query-frontend is the read endpoint used by the OpenShift console UI plugin.

Exercise 2: Explore the application’s OpenTelemetry instrumentation

The three Go services—frontend, backend, and database—are already instrumented with the OpenTelemetry SDK. Telemetry generation is gated by an environment variable so it can be enabled without rebuilding the container image.

Steps

  1. Inspect the shared telemetry package:

    The repository contains a shared telemetry package used by all three services at src/telemetry/telemetry.go.

    Open the Source Code tab in the workshop application (the running frontend) and navigate to telemetry/telemetry.go to browse the file with syntax highlighting.

    Key points in this package:

    • Enabled() function: Returns true only when OTEL_ENABLED=true is set in the environment. All SDK initialization is skipped when false, so the application behaves identically to an uninstrumented binary.

    • Setup() function: Initializes three OTLP HTTP exporters when enabled—trace, metric, and log—all targeting OTEL_EXPORTER_OTLP_ENDPOINT. Once telemetry is enabled in Exercise 6, this will point to http://localhost:4318 (the injected sidecar collector).

  2. Inspect the backend service instrumentation:

    Open the Source Code tab in the workshop application and select backend/main.go to browse the file directly.

    You will see three instrumentation layers activated when OTEL_ENABLED=true:

    • telemetry.Setup(): Creates the global TraceProvider, MeterProvider, and LoggerProvider from the OTLP exporters

    • otelslog.NewHandler(): Bridges the Go standard slog logger to the OTel log exporter—every structured log line is emitted as an OTel log record carrying the active trace ID

    • otelhttp.NewTransport(): Wraps the outbound HTTP client so the W3C traceparent header is injected into every downstream call

  3. Understand the server-side instrumentation:

    The inbound HTTP handler is wrapped with otelhttp.NewHandler(), which:

    • Creates a server span for every inbound request

    • Extracts the incoming traceparent header and registers this span as a child of the calling service’s span

    • Automatically records HTTP attributes (http.method, http.route, http.status_code) on the span

  4. Understand context propagation across services:

    The trace context flows automatically through your microservices:

    Browser
      +-- frontend (otelhttp server span: GET /new-note)
           |  injects traceparent into outbound request
           +-- backend (otelhttp server span: POST /api/notes)
                |  injects traceparent into outbound request
                +-- database (otelhttp server span: POST /api/events)

    Because every service uses otelhttp for both inbound (handler) and outbound (transport) HTTP calls, the trace ID and parent span ID are automatically threaded through the entire call chain with no manual span creation required in business logic code.

  5. Review the enable-otel.yaml patch file:

    Open the Source Code tab in the workshop application and select enable-otel.yaml to browse the full patch file.

    This file contains three Deployment patches—one per service. When applied, each patch:

    • Sets OTEL_ENABLED=true to activate the SDK

    • Sets OTEL_SERVICE_NAME to the service name (becomes the service.name resource attribute)

    • Sets OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 to send telemetry to the injected sidecar

    • Adds the pod annotation sidecar.opentelemetry.io/inject: "sidecar" to trigger sidecar injection

    • Sets serviceAccountName: otel-collector-sidecar for RBAC access to the Kubernetes API

  6. Verify the current state of the deployments:

    oc get deployment frontend backend \
      -n %OPENSHIFT_USERNAME%-observability-demo \
      -o custom-columns='NAME:.metadata.name,CONTAINERS:.spec.template.spec.containers[*].name'
    oc get statefulset database \
      -n %OPENSHIFT_USERNAME%-observability-demo \
      -o custom-columns='NAME:.metadata.name,CONTAINERS:.spec.template.spec.containers[*].name'
    Expected output
    NAME       CONTAINERS
    frontend   frontend
    backend    backend
    NAME       CONTAINERS
    database   database

    Each service currently has a single application container. After enabling OpenTelemetry later in this module, each pod will gain a second container—the injected sidecar collector.

Verify

  • ✓ src/telemetry/telemetry.go gates all SDK initialization on OTEL_ENABLED=true

  • ✓ otelhttp.NewHandler() creates server spans and extracts incoming trace context

  • ✓ otelhttp.NewTransport() injects the traceparent header into all outbound HTTP calls

  • ✓ otelslog.NewHandler() bridges Go structured logs to OTel log records

  • ✓ Current deployments have one container each (no OTel yet)

What you learned: Effective OpenTelemetry Go instrumentation uses three layers—trace provider, otelhttp middleware, and log bridge—to emit traces, metrics, and correlated logs from a single SDK setup call. The otelhttp transport ensures W3C trace context propagation across all service boundaries automatically, without any manual span creation in application code.
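The gate-on-environment pattern from telemetry.go can be sketched as follows (Python used purely as illustration; the function names mirror, but are not, the actual Go implementation in src/telemetry/telemetry.go):

```python
import os
from typing import Optional

def telemetry_enabled() -> bool:
    # Mirrors the Go Enabled() gate: telemetry is strictly opt-in
    return os.environ.get("OTEL_ENABLED", "").lower() == "true"

def setup_telemetry() -> Optional[str]:
    # Mirrors Setup(): skip all SDK wiring unless explicitly enabled,
    # then read the endpoint the sidecar collector listens on
    if not telemetry_enabled():
        return None  # behaves exactly like an uninstrumented binary
    return os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318")
```

With OTEL_ENABLED unset, no exporters are ever created, which is why the container image never needs rebuilding to switch telemetry on or off.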

Exercise 3: Verify the OpenTelemetry operator and pre-deployed components

Before creating resources in your namespace, confirm the OpenTelemetry Operator and the shared observability-demo infrastructure are healthy.

Steps

  1. Verify the OpenTelemetry Operator pod is running:

    oc get pods -n openshift-operators \
      -l app.kubernetes.io/name=opentelemetry-operator
    Expected output
    NAME                                          READY   STATUS    RESTARTS   AGE
    opentelemetry-operator-controller-xxxxx       2/2     Running   0          1h
  2. Confirm the operator registered the CRDs:

    oc api-resources | grep opentelemetry
    Expected output
    instrumentations              opentelemetry.io   true    Instrumentation
    opentelemetrycollectors       opentelemetry.io   true    OpenTelemetryCollector
  3. Inspect the pre-deployed central collector:

    oc get opentelemetrycollector central-collector -n observability-demo
    Expected output
    NAME               MODE         VERSION
    central-collector  deployment   0.140.0-2
  4. Verify the central collector pods are running:

    oc get pods -n observability-demo \
      -l app.kubernetes.io/name=central-collector-collector
    Expected output
    NAME                                      READY   STATUS    RESTARTS   AGE
    central-collector-collector-xxxxx         1/1     Running   0          1h
    central-collector-collector-xxxxx         1/1     Running   0          1h

    Two replicas provide resilience for the shared collection endpoint.

  5. Inspect the central collector service (this is where sidecars forward telemetry):

    oc get svc central-collector-collector -n observability-demo
    Expected output
    NAME                           TYPE        CLUSTER-IP      PORT(S)
    central-collector-collector    ClusterIP   172.30.x.x      4317/TCP, 4318/TCP

Verify

  • ✓ OpenTelemetry Operator pod is Running (2/2)

  • ✓ OpenTelemetryCollector and Instrumentation CRDs are registered

  • ✓ central-collector exists in observability-demo in deployment mode with 2 replicas

  • ✓ central-collector-collector service exposes ports 4317 and 4318

What you learned: The central collector is pre-deployed in the shared observability-demo namespace by GitOps. Your task is to create the per-namespace sidecar collector and Instrumentation CR in your own namespace, then wire the application into that pipeline.

Exercise 4: Create the sidecar collector in your namespace

You will deploy a sidecar-mode OpenTelemetryCollector CR in your %OPENSHIFT_USERNAME%-observability-demo namespace. When this CR exists, the OpenTelemetry Operator automatically injects a sidecar container into any pod in the namespace that carries the annotation sidecar.opentelemetry.io/inject: "sidecar".

Steps

  1. Verify the required ServiceAccount is present in your namespace:

    oc get serviceaccount otel-collector-sidecar \
      -n %OPENSHIFT_USERNAME%-observability-demo

    This ServiceAccount was pre-created for your namespace with the RBAC permissions needed by the k8sattributes and resourcedetection processors (read access to pods, namespaces, and nodes).

    Expected output
    NAME                    SECRETS   AGE
    otel-collector-sidecar  0         1h
  2. Create the sidecar OpenTelemetryCollector CR:

    cat <<EOF | oc apply -f -
    apiVersion: opentelemetry.io/v1beta1
    kind: OpenTelemetryCollector
    metadata:
      name: sidecar
      namespace: %OPENSHIFT_USERNAME%-observability-demo
    spec:
      mode: sidecar
      serviceAccount: otel-collector-sidecar
      config:
        receivers:
          otlp:
            protocols:
              grpc:
                endpoint: 0.0.0.0:4317
              http:
                endpoint: 0.0.0.0:4318
        processors:
          memory_limiter:
            check_interval: 1s
            limit_percentage: 75
            spike_limit_percentage: 15
          resourcedetection:
            detectors: [openshift]
            timeout: 2s
          k8sattributes:
            auth_type: serviceAccount
            passthrough: false
            extract:
              metadata:
                - k8s.namespace.name
                - k8s.deployment.name
                - k8s.node.name
                - k8s.pod.name
                - k8s.pod.uid
          batch:
            timeout: 10s
            send_batch_size: 1024
        exporters:
          otlp/grpc:
            endpoint: central-collector-collector.observability-demo.svc:4317
            tls:
              insecure: true
            sending_queue:
              enabled: true
              queue_size: 5000
            retry_on_failure:
              enabled: true
              initial_interval: 5s
              max_interval: 30s
              max_elapsed_time: 10m
        service:
          pipelines:
            traces:
              receivers: [otlp]
              processors: [memory_limiter, resourcedetection, k8sattributes, batch]
              exporters: [otlp/grpc]
            metrics:
              receivers: [otlp]
              processors: [memory_limiter, resourcedetection, k8sattributes, batch]
              exporters: [otlp/grpc]
            logs:
              receivers: [otlp]
              processors: [memory_limiter, resourcedetection, k8sattributes, batch]
              exporters: [otlp/grpc]
    EOF
  3. Verify the CR was accepted:

    oc get opentelemetrycollector sidecar -n %OPENSHIFT_USERNAME%-observability-demo
    Expected output
    NAME     MODE    VERSION
    sidecar  sidecar 0.140.0-2

    In sidecar mode, the operator does not create a standalone Deployment. Instead it stores the container spec and injects it into pods on demand when the annotation is detected.

Understand the pipeline

The sidecar carries all three signal types over the same processor chain:

              otlp receiver  (localhost:4317 gRPC / localhost:4318 HTTP)
                    |
         memory_limiter  -- drop if pod memory > 75%
                    |
         resourcedetection -- add k8s.cluster.name, cloud.platform
                    |
         k8sattributes   -- add k8s.pod.name, k8s.deployment.name,
                            k8s.namespace.name, k8s.pod.uid
                    |
              batch        -- buffer and flush (max 10s / 1024 records)
                    |
            otlp/grpc exporter -> central-collector-collector.observability-demo.svc:4317
                      (gRPC, insecure + retry queue for resiliency)

The sidecar does not route signals to different destinations—that responsibility belongs to the central collector. All signals arrive at the central on a single gRPC stream.

Verify

  • ✓ otel-collector-sidecar ServiceAccount exists in %OPENSHIFT_USERNAME%-observability-demo

  • ✓ sidecar OpenTelemetryCollector CR exists in sidecar mode

  • ✓ CR configuration includes all three pipelines (traces, metrics, logs)

  • ✓ Exporter is otlp/grpc with queue/retry enabled and endpoint central-collector-collector.observability-demo.svc:4317

Exercise 5: Create the Instrumentation CR in your namespace

The Instrumentation CR is a template that tells the OpenTelemetry Operator how to configure the auto-instrumentation agent init-container when injecting into pods. You need one per namespace. In this exercise you create it for %OPENSHIFT_USERNAME%-observability-demo—it will be used in Exercise 9 when you enable Python auto-instrumentation on the notifier service.

Steps

  1. Create the Instrumentation CR:

    cat <<EOF | oc apply -f -
    apiVersion: opentelemetry.io/v1alpha1
    kind: Instrumentation
    metadata:
      name: my-instrumentation
      namespace: %OPENSHIFT_USERNAME%-observability-demo
    spec:
      exporter:
        endpoint: http://localhost:4318
      sampler:
        type: parentbased_traceidratio
        argument: "1.0"
      propagators:
        - tracecontext
        - baggage
      python:
        env:
          - name: OTEL_EXPORTER_OTLP_PROTOCOL
            value: http/protobuf
    EOF

    Key fields explained:

    • spec.exporter.endpoint: http://localhost:4318 — the auto-instrumented process sends telemetry to the sidecar on localhost (both containers live in the same pod)

    • spec.sampler.type: parentbased_traceidratio with argument: "1.0" — 100% sampling, suitable for a workshop

    • spec.propagators: [tracecontext, baggage] — W3C Trace Context headers so the incoming traceparent from the Go backend is read and spans are linked into the existing trace

    • spec.python.env OTEL_EXPORTER_OTLP_PROTOCOL: http/protobuf — forces the Python SDK to use HTTP/protobuf rather than gRPC, matching the sidecar receiver on port 4318

  2. Verify the CR was accepted:

    oc get instrumentation my-instrumentation -n %OPENSHIFT_USERNAME%-observability-demo
    Expected output
    NAME                 AGE
    my-instrumentation   10s

Verify

  • ✓ my-instrumentation Instrumentation CR exists in %OPENSHIFT_USERNAME%-observability-demo

  • ✓ spec.exporter.endpoint is http://localhost:4318

What you learned: The Instrumentation CR is a per-namespace configuration template. Unlike the sidecar CR (which provides a container spec merged into application pods), the Instrumentation CR is referenced by pods via the instrumentation.opentelemetry.io/inject-<language> annotation — the Operator reads it and injects a language-specific agent init-container plus the necessary environment variables with no application changes required.
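The parentbased_traceidratio behavior configured above can be sketched like this (a simplified illustration of the spec's ratio test, not the SDK's exact code):

```python
def traceid_ratio_sampled(trace_id_hex: str, ratio: float) -> bool:
    # Sample when the low 64 bits of the trace ID fall below ratio * 2^64,
    # giving a deterministic, ID-based decision
    return int(trace_id_hex[16:], 16) < int(ratio * (1 << 64))

def parentbased_sample(parent_sampled, trace_id_hex, ratio):
    # A child span always follows its parent's decision;
    # only root spans consult the ratio
    if parent_sampled is not None:
        return parent_sampled
    return traceid_ratio_sampled(trace_id_hex, ratio)

# argument "1.0" means every root trace is sampled
assert parentbased_sample(None, "c3" * 16, 1.0) is True
```

Parent-based sampling matters in a multi-service chain: once the frontend samples a trace, every downstream service keeps it, so you never get partial traces.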

Exercise 6: Enable OpenTelemetry on the applications

Now you’ll activate the OpenTelemetry SDK in all three Go services. This adds the sidecar injection annotation, the SDK environment variables, and the correct ServiceAccount to each deployment without rebuilding container images.

Steps

  1. Set your namespace as a shell variable:

    NAMESPACE="%OPENSHIFT_USERNAME%-observability-demo"
  2. Add the OTEL environment variables to all three services:

    for app in frontend backend; do
      oc set env deployment/${app} -n ${NAMESPACE} \
        OTEL_ENABLED=true \
        OTEL_SERVICE_NAME=${app} \
        OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
        OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    done
    oc set env statefulset/database -n ${NAMESPACE} \
      OTEL_ENABLED=true \
      OTEL_SERVICE_NAME=database \
      OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
      OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    Expected output
    deployment.apps/frontend updated
    deployment.apps/backend updated
    statefulset.apps/database updated
  3. Add the sidecar injection annotation to each pod template:

    for app in frontend backend; do
      oc patch deployment/${app} -n ${NAMESPACE} \
        --type=strategic \
        -p='{"spec":{"template":{"metadata":{"annotations":{"sidecar.opentelemetry.io/inject":"sidecar"}}}}}'
    done
    oc patch statefulset/database -n ${NAMESPACE} \
      --type=strategic \
      -p='{"spec":{"template":{"metadata":{"annotations":{"sidecar.opentelemetry.io/inject":"sidecar"}}}}}'

    The annotation value sidecar must match the name of the OpenTelemetryCollector CR you created in Exercise 4. When the operator sees this annotation on a pod being created, it injects the collector container spec from that CR.

    Expected output
    deployment.apps/frontend patched
    deployment.apps/backend patched
    statefulset.apps/database patched
  4. Set the ServiceAccount on all three services so the injected sidecar can call the Kubernetes API:

    for app in frontend backend; do
      oc set serviceaccount deployment/${app} \
        otel-collector-sidecar \
        -n ${NAMESPACE}
    done
    oc set serviceaccount statefulset/database \
      otel-collector-sidecar \
      -n ${NAMESPACE}
    Expected output
    deployment.apps/frontend serviceaccount updated
    deployment.apps/backend serviceaccount updated
    statefulset.apps/database serviceaccount updated
  5. Monitor the rolling restart:

    database is a StatefulSet with a ReadWriteOnce PVC. Its default RollingUpdate strategy terminates the existing pod before starting the replacement, so the volume is cleanly released. Expect a few seconds of database unavailability during this step.

    oc rollout status deployment/frontend deployment/backend \
      -n ${NAMESPACE}
    oc rollout status statefulset/database \
      -n ${NAMESPACE}
    Expected output
    deployment "frontend" successfully rolled out
    deployment "backend" successfully rolled out
    statefulset rolling to 1 pods at revision database-xxxxx
    waiting for statefulset rolling update to complete 0 pods at revision database-xxxxx...
    statefulset rolling update complete 1 pods at revision database-xxxxx...
  6. Verify sidecar injection occurred:

    oc get pods -n ${NAMESPACE} \
      -o custom-columns='NAME:.metadata.name,CONTAINERS:.spec.containers[*].name'
    Expected output
    NAME                        CONTAINERS
    frontend-xxxxx              frontend, otc-container
    backend-xxxxx               backend, otc-container
    database-0                  database, otc-container

    The otc-container is the injected OpenTelemetry Collector sidecar. Each pod now has two containers: the application and its dedicated collector.

  7. Confirm the sidecar container is running and pipelines have started:

    oc logs -n ${NAMESPACE} \
      -l app=frontend -c otc-container | tail -10
    Expected output (excerpt)
    Everything is ready. Begin running and processing data.
    Pipeline started (traces).
    Pipeline started (metrics).
    Pipeline started (logs).
  8. Generate application traffic to produce telemetry:

    FRONTEND_URL=$(oc get route frontend \
      -n ${NAMESPACE} \
      -o jsonpath='{.spec.host}')
    
    for i in $(seq 1 30); do
      curl -sk -o /dev/null "https://$FRONTEND_URL/"
      curl -sk -o /dev/null -X POST "https://$FRONTEND_URL/api/notes" \
        -H "Content-Type: application/json" \
        -d "{\"title\":\"Test $i\",\"content\":\"OTel test\"}"
      sleep 1
    done
    echo "Traffic generation complete"

    This sends 30 iterations of mixed GET and POST requests, generating spans at each hop through frontend → backend → database.

Verify

  • ✓ OTEL environment variables set on all three deployments

  • ✓ Sidecar injection annotation present on all three pod templates

  • ✓ All three deployments rolled out successfully

  • ✓ Each pod has two containers: the application and otc-container (sidecar)

  • ✓ Sidecar logs show all three pipelines started

  • ✓ Traffic generated against the frontend route

What you learned: The OpenTelemetry Operator’s sidecar injection mechanism transforms a running pod by adding a pre-configured collector container, without rebuilding the image. The annotation sidecar.opentelemetry.io/inject: "sidecar" tells the operator to use the sidecar CR you created in Exercise 4 as the container specification.

Exercise 7: View and explore live traces

With traffic flowing and telemetry active, you can now view real traces from your application in the OpenShift console and use TraceQL to query them.
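TraceQL selects traces by matching span attributes and intrinsics inside `{ }` spanset filters. Queries along these lines should work against the traces generated in this module (a sketch; attribute names depend on what your instrumentation actually records):

```traceql
# Traces through the backend service slower than 200ms
{ resource.service.name = "backend" && duration > 200ms }

# Traces containing a server error anywhere in the chain
{ span.http.status_code >= 500 }
```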

Steps

  1. Navigate to Observe → Traces in the OpenShift console.

  2. Set the search parameters:

    • Namespace: %OPENSHIFT_USERNAME%-observability-demo

    • Time range: Last 5 minutes

      Click Search or the refresh icon.

      The scatter plot should now show individual points, one per trace, with y-axis showing duration in milliseconds.

  3. Review the scatter plot and trace list:

    Each point in the scatter plot represents a single trace:

    • X-axis: trace start time

    • Y-axis: total trace duration (ms)

    • Bubble size: number of spans in the trace

      Clusters of points at the top of the chart indicate slow traces. The trace list below shows:

    • Trace name: root span operation (for example, frontend: GET /)

    • Spans: total number of spans in the trace

    • Duration: end-to-end time

    • Start time: when the root span began

      Click one of the higher points to open the trace detail view.

  4. Explore the trace waterfall:

    The waterfall shows a horizontal bar for each span, indented to reflect parent-child relationships. The otelhttp library creates two spans per service boundary: a server span named after the matched route (POST /api/notes) and a client span named after the HTTP method (HTTP POST) for each outbound call:

    frontend: POST /api/notes            [=====================================] 134ms
      frontend: HTTP POST                [===================================] 130ms
        backend: POST /api/notes         [========================] 100ms
          backend: HTTP POST             [================] 56ms
            database: POST /notes        [===============] 55ms
          backend: HTTP POST             [========] 25ms

    Bar length represents duration. The server span for each service is the parent of that service’s outgoing client spans.

  5. Click a span to expand its attributes:

    The attributes on any span come from three distinct layers—the application itself, the Go OTel SDK semantic conventions, and the sidecar collector processors.

    Frontend span (frontend: POST /api/notes)

    Attribute Value / description

    http.method

    POST

    http.status_code

    200

    http.route

    Matched route pattern (for example, /api/notes)

    baggage.client.platform

    web — set by the frontend baggage middleware and propagated to every downstream span via the W3C baggage header

    baggage.request.source

    workshop-demo — identifies the request origin; visible on all three service spans in this trace

    Backend span (backend: POST /api/notes)

    Attribute Value / description

    http.method

    POST

    http.status_code

    200

    db.system

    chainsql — identifies the downstream data store

    db.operation

    INSERT — derived from the HTTP method (POST → INSERT, GET → SELECT)

    db.sql.table

    notes — extracted from the request path

    peer.service

    database — the logical name of the service called

    net.peer.name

    database — DNS hostname of the database service

    db.response.status_code

    HTTP status returned by the database service

    baggage.client.platform

    web — forwarded from the W3C baggage HTTP header

    baggage.request.source

    workshop-demo — forwarded from the W3C baggage HTTP header

    Database span (database: POST /notes)

    Attribute Value / description

    db.system

    chainsql

    db.operation

    INSERT

    db.sql.table

    notes

    note.id

    Unique ID assigned to the created note record

    note.title

    Title string from the request body

    note.content.length

    Character length of the note content

    event.id

    ID of the audit event recorded alongside the note

    event.source

    Service name that triggered the event

    event.http_status

    HTTP status of the original request that created the event

    Sidecar-added attributes (present on every span regardless of service):

    • Kubernetes attributes (k8sattributes processor): k8s.pod.name, k8s.deployment.name, k8s.namespace.name

    • Resource attributes (resourcedetection processor): cloud.platform, k8s.cluster.name

  6. Observe W3C Baggage propagation:

    Baggage is a key-value store carried inside the baggage HTTP header alongside traceparent. The frontend sets two members before every outbound request:

    baggage: client.platform=web,request.source=workshop-demo

    Each downstream service reads this header and records every baggage member as a span attribute with a baggage. prefix (for example, baggage.client.platform). This means the same logical context—where the request came from—is searchable as a span attribute on every service in the chain.
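
This header-to-attribute mapping can be sketched in plain Python (illustrative only; the real parsing is done by the OTel SDK's baggage propagator):

```python
def baggage_to_span_attributes(header: str) -> dict:
    """Parse a W3C 'baggage' header and prefix each member with 'baggage.',
    mirroring how the workshop services record baggage as span attributes."""
    attributes = {}
    for member in header.split(","):
        key, _, value = member.strip().partition("=")
        # Baggage members may carry ';'-separated properties; keep only the value.
        attributes[f"baggage.{key}"] = value.split(";")[0].strip()
    return attributes

print(baggage_to_span_attributes("client.platform=web,request.source=workshop-demo"))
# {'baggage.client.platform': 'web', 'baggage.request.source': 'workshop-demo'}
```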

    All TraceQL queries below include { .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" } to scope results to your namespace. The Tempo backend is shared across all workshop users, so this filter is required to see only your traces.

    Filter all traces that originated from the web frontend:

    { .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["baggage.client.platform"] = "web" }

    Or find every span produced by the workshop demo requests, regardless of service:

    { .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["baggage.request.source"] = "workshop-demo" }
  7. Use TraceQL to find slow database spans:

    Click Show query beneath the filter bar to reveal the TraceQL editor. TraceQL is the Tempo query language, similar to PromQL for Prometheus or LogQL for Loki.

    { .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && resource.service.name = "database" && duration > 50ms }

    This filters to only traces in your namespace where the database service had at least one span exceeding 50ms. Because random delays of up to 60ms are injected in the backend and database services, you should see several results.

    Query by the specific table the slow operation touched:

    { .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["db.sql.table"] = "notes" && duration > 30ms }

    Or filter by database operation type:

    { .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["db.operation"] = "INSERT" && resource.service.name = "database" }
  8. Use TraceQL to surface business-level data:

    The custom note.* and event.* attributes open the trace store as a queryable record of application events. Find all traces that created a note:

    { .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["note.id"] != "" }

    Or find traces by peer service to understand the backend-to-database communication pattern:

    { .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["peer.service"] = "database" && span["db.sql.table"] = "notes" }
  9. Find error traces:

    { .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && status = error }

    If any requests returned HTTP errors, this will surface the traces where something went wrong, with all spans visible for root cause analysis.

  10. Correlate a trace with logs:

    Note the Trace ID shown at the top of a trace detail view (a 32-character hex string).

    Navigate to Observe → Logs.

    Query for log lines containing that trace ID:

    {kubernetes_namespace_name="%OPENSHIFT_USERNAME%-observability-demo"} |= "<your-trace-id>"

    Because the application uses otelslog.NewHandler(), every structured log line emitted during a traced request carries the active trace ID as a log field. This lets you move directly from a slow span to the exact log lines emitted during that span.

Verify

  • ✓ Traces appear in Observe → Traces for namespace %OPENSHIFT_USERNAME%-observability-demo

  • ✓ Trace waterfall shows spans from all three services (frontend, backend, database)

  • ✓ Frontend spans carry baggage.client.platform=web and baggage.request.source=workshop-demo

  • ✓ Backend spans carry db.system, db.operation, db.sql.table, peer.service, db.response.status_code

  • ✓ Database spans carry note.id, note.title, note.content.length on note-creation requests

  • ✓ All spans include Kubernetes metadata (k8s.pod.name, k8s.deployment.name)

  • ✓ TraceQL query { .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["baggage.request.source"] = "workshop-demo" } returns results

  • ✓ A trace ID from the trace view can be found in Observe → Logs

What you learned: The workshop application uses two complementary instrumentation strategies. The sidecar collector processors automatically enrich spans with Kubernetes metadata—no application code required. The Go OTel SDK adds semantic-convention attributes (db.*, peer.*, net.*) for production-grade observability, business-level attributes (note.*, event.*) for application-specific queries, and W3C Baggage to propagate contextual metadata across all service boundaries. TraceQL lets you query across infrastructure, application, and business dimensions using a single query language.

Exercise 8: Explore the central collector pipeline

The central collector in observability-demo receives telemetry from all sidecar collectors and routes it to three backends. Inspect the configuration and verify each export path.

Steps

  1. View the central collector configuration:

    oc get opentelemetrycollector central-collector \
      -n observability-demo \
      -o jsonpath='{.spec.config}' | yq .

    Note the four export destinations:

    • otlp/tempo → tempo-tempo-distributor.openshift-tempo-operator.svc:4317 — traces, TLS + bearer token auth + X-Scope-OrgID: dev header

    • prometheusremotewrite → COO Prometheus /api/v1/write — application metrics pushed from pods

    • prometheusremotewrite (via the metrics/spanmetrics pipeline) → same endpoint — span-derived RED metrics

    • otlphttp/logs → LokiStack gateway at openshift-logging — logs, OTLP/HTTP JSON + bearer token + service CA TLS

  2. Inspect the spanmetrics connector:

    oc get opentelemetrycollector central-collector \
      -n observability-demo \
      -o jsonpath='{.spec.config}' | yq '.connectors.spanmetrics'

    The spanmetrics connector acts as both an exporter (receiving spans from the traces pipeline) and a receiver (producing metric records for the metrics/spanmetrics pipeline). It generates latency histograms and call-count counters for every service.name + span.name combination automatically.
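
That dual role can be sketched in pipeline wiring (abbreviated; your central collector's actual config may name things differently):

```yaml
# Sketch: the spanmetrics connector appears as an exporter in one pipeline
# and as a receiver in another, bridging traces to metrics.
connectors:
  spanmetrics: {}   # defaults: duration histogram + call counter per service.name/span.name
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/tempo, spanmetrics]   # spans flow into the connector here...
    metrics/spanmetrics:
      receivers: [spanmetrics]               # ...and come out as metric records here
      exporters: [prometheusremotewrite]
```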

  3. Query the generated span metrics in Prometheus:

    Navigate to Observe → Metrics and enter:

    sum(rate(traces_span_metrics_calls_total{service_name="frontend"}[5m])) by (span_name)

    This shows the request rate per operation for the frontend service—derived solely from traces, with no Prometheus client library code in the application.

    To see the p95 latency for backend operations:

    histogram_quantile(0.95,
      sum(rate(traces_span_metrics_duration_milliseconds_bucket{k8s_namespace_name="%OPENSHIFT_USERNAME%-observability-demo", k8s_deployment_name="backend"}[5m]))
      by (span_name, le)
    )
  4. Understand the logs pipeline:

    The central collector runs a dedicated logs pipeline that maps OTEL resource attributes to the label keys expected by the LokiStack openshift-logging multi-tenancy gateway:

    processors:
      resource/logs:            (1)
        attributes:
          - {key: kubernetes.namespace_name, from_attribute: k8s.namespace.name, action: upsert}
          - {key: kubernetes.pod_name,       from_attribute: k8s.pod.name,       action: upsert}
          - {key: kubernetes.container_name, from_attribute: k8s.container.name, action: upsert}
          - {key: log_type, value: application, action: upsert}
      transform/logs:           (2)
        log_statements:
          - context: log
            statements:
              - set(attributes["level"], ConvertCase(severity_text, "lower"))
    exporters:
      otlphttp/logs:            (3)
        endpoint: https://logging-loki-gateway-http.openshift-logging.svc.cluster.local:8080/api/logs/v1/application/otlp
        encoding: json
        tls:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
        auth:
          authenticator: bearertokenauth  (4)
    1 LokiStack in openshift-logging tenancy mode requires kubernetes.namespace_name (not k8s.namespace.name) for tenant routing.
    2 Derives a level field from OTEL severity_text (lowercased) for log filtering in the Loki UI.
    3 Sends log records as OTLP/HTTP JSON to the LokiStack gateway’s application tenant endpoint.
    4 Uses the pod’s service account token for bearer auth—the central collector SA has loki.grafana.com/application create rights via ClusterRoleBinding.

    Verify a log record arrived in Loki by navigating to Observe → Logs in the OpenShift console and querying:

    {kubernetes_namespace_name="%OPENSHIFT_USERNAME%-observability-demo"} | json

    You should see structured log records from frontend, backend, and database, each containing traceID and spanID fields that link them to the traces you viewed in Exercise 7.
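
The effect of the resource/logs and transform/logs processors can be sketched in plain Python (illustrative; the collector performs this natively):

```python
# OTEL resource attribute -> Loki-style label key, as in the resource/logs processor.
REMAP = {
    "k8s.namespace.name": "kubernetes.namespace_name",
    "k8s.pod.name": "kubernetes.pod_name",
    "k8s.container.name": "kubernetes.container_name",
}

def remap_log_record(resource: dict, severity_text: str) -> dict:
    """Mimic resource/logs (upsert Loki labels) and transform/logs (derive level)."""
    out = {loki_key: resource[otel_key]
           for otel_key, loki_key in REMAP.items() if otel_key in resource}
    out["log_type"] = "application"
    out["level"] = severity_text.lower()   # ConvertCase(severity_text, "lower")
    return out

print(remap_log_record({"k8s.namespace.name": "demo", "k8s.pod.name": "backend-abc"}, "WARN"))
# {'kubernetes.namespace_name': 'demo', 'kubernetes.pod_name': 'backend-abc', 'log_type': 'application', 'level': 'warn'}
```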

Verify

  • ✓ Central collector configuration shows otlp/tempo, prometheusremotewrite, and otlphttp/logs exporters

  • ✓ spanmetrics connector configuration is visible

  • ✓ traces_span_metrics_calls_total metric exists in Prometheus for your services

  • ✓ traces_span_metrics_duration_milliseconds_bucket histogram is queryable for p95 latency

  • ✓ Observe → Logs shows structured log records from %OPENSHIFT_USERNAME%-observability-demo with traceID fields

What you learned: The central collector is a fanout hub—one OTLP receiver, four exporters. Each signal type is independently processed and routed: traces go to Tempo (with TLS and multi-tenancy headers), metrics are remote-written to COO Prometheus, span-derived RED metrics follow the same path via the spanmetrics connector, and logs are attribute-remapped and forwarded to the LokiStack application tenant using service-account bearer auth.

Exercise 9: Zero-code Python auto-instrumentation

The workshop application includes a fourth service: notifier, a Python/FastAPI microservice. The backend calls notifier after every note create, update, or delete operation. The notifier records the event in the database service.

Open src/notifier/app.py and notice what is absent: there are no OpenTelemetry imports of any kind. The file contains only FastAPI route handlers and an httpx call. Yet by the end of this exercise, full traces—including spans for every notifier HTTP call—will appear in the trace waterfall.

Open the Source Code tab in the workshop application and select notifier/app.py to compare it with the Go services.

This demonstrates the difference between the manual SDK approach used by the Go services and the zero-code auto-instrumentation provided by the OpenTelemetry Operator.

Service Language Instrumentation method

frontend

Go

Manual SDK (telemetry.Setup() + otelhttp)

backend

Go

Manual SDK (telemetry.Setup() + otelhttp)

database

Go

Manual SDK (telemetry.Setup() + otelhttp)

notifier

Python

Zero-code: OTel Operator init-container injection

Step 1: Observe the missing notifier span

Before enabling auto-instrumentation, generate traffic and look at a trace waterfall.

  1. Generate several note-creation requests:

    FRONTEND_URL=$(oc get route frontend \
      -n %OPENSHIFT_USERNAME%-observability-demo \
      -o jsonpath='{.spec.host}')
    
    for i in $(seq 1 10); do
      curl -sk -o /dev/null -X POST "https://$FRONTEND_URL/api/notes" \
        -H "Content-Type: application/json" \
        -d "{\"title\":\"Auto-instrumentation test $i\",\"content\":\"Exercise 9\"}"
      sleep 1
    done
  2. Open Observe → Traces in the OpenShift console, set namespace to %OPENSHIFT_USERNAME%-observability-demo, and click a recent trace.

    The waterfall will show three hops: frontend → backend → database. The backend called notifier, but no notifier span appears because the Python process emits no telemetry without an agent.

Step 2: Annotate the notifier deployment

A single oc patch command adds the two annotations that trigger both sidecar injection (same as the Go services) and Python agent injection.

  1. Patch the notifier deployment:

    oc patch deployment notifier \
      -n %OPENSHIFT_USERNAME%-observability-demo \
      --type=json \
      -p='[
        {
          "op": "add",
          "path": "/spec/template/metadata/annotations",
          "value": {
            "sidecar.opentelemetry.io/inject": "sidecar",
            "instrumentation.opentelemetry.io/inject-python": "my-instrumentation"
          }
        },
        {
          "op": "add",
          "path": "/spec/template/spec/serviceAccountName",
          "value": "otel-collector-sidecar"
        }
      ]'

    The annotation value my-instrumentation references the Instrumentation CR you created in Exercise 5 in the same namespace.

    When this annotated pod is scheduled, the OpenTelemetry Operator’s mutating admission webhook injects an init container that downloads opentelemetry-distro and opentelemetry-instrumentation-fastapi into a shared volume. The Python process picks them up via PYTHONPATH and a configurator hook—no code changes or image rebuilds required.

  2. Watch the rollout:

    oc rollout status deployment/notifier -n %OPENSHIFT_USERNAME%-observability-demo
  3. Confirm the pod now has two containers (application + sidecar):

    oc get pods -n %OPENSHIFT_USERNAME%-observability-demo \
      -l app=notifier \
      -o custom-columns='NAME:.metadata.name,CONTAINERS:.spec.containers[*].name'
    Expected output
    NAME                        CONTAINERS
    notifier-xxxxx              notifier, otc-container
  4. Verify the Python SDK is active by checking notifier logs:

    oc logs -n %OPENSHIFT_USERNAME%-observability-demo \
      -l app=notifier -c notifier | head -20

    Look for OpenTelemetry bootstrap messages such as Instrumenting FastAPI or OpenTelemetry SDK configured.
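
The mutation the webhook applied in step 1 can be sketched roughly as follows (container name, image, and paths are typical of the operator's Python injection but are illustrative here, not exact operator output):

```yaml
# Rough sketch of the webhook-mutated notifier pod spec (heavily simplified).
spec:
  initContainers:
    - name: opentelemetry-auto-instrumentation-python
      image: ...autoinstrumentation-python:<version>   # ships the agent packages
      volumeMounts:
        - name: opentelemetry-auto-instrumentation-python
          mountPath: /otel-auto-instrumentation-python
  containers:
    - name: notifier
      env:
        - name: PYTHONPATH          # injected distro is imported before app code
          value: /otel-auto-instrumentation-python/opentelemetry/instrumentation/auto_instrumentation:/otel-auto-instrumentation-python
        - name: OTEL_SERVICE_NAME
          value: notifier
      volumeMounts:
        - name: opentelemetry-auto-instrumentation-python
          mountPath: /otel-auto-instrumentation-python
```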

Step 3: Generate traffic and observe the four-hop trace

  1. Send another batch of note-creation requests:

    FRONTEND_URL=$(oc get route frontend \
      -n %OPENSHIFT_USERNAME%-observability-demo \
      -o jsonpath='{.spec.host}')
    
    for i in $(seq 1 15); do
      curl -sk -o /dev/null -X POST "https://$FRONTEND_URL/api/notes" \
        -H "Content-Type: application/json" \
        -d "{\"title\":\"Traced note $i\",\"content\":\"With notifier\"}"
      sleep 1
    done
    echo "Done"
  2. Return to Observe → Traces in the console. A note-creation trace will now show a fourth hop. The backend: HTTP POST client span that previously had no child now has a notifier server span beneath it:

    frontend: POST /api/notes              [=====================================] 134ms
      frontend: HTTP POST                  [===================================] 130ms
        backend: POST /api/notes           [========================] 100ms
          backend: HTTP POST               [================] 56ms
            database: POST /notes          [===============] 55ms
          backend: HTTP POST               [========] 25ms
            notifier: POST /notify         [======] 20ms
              notifier: HTTP POST          [===] 10ms

    The notifier spans are produced entirely by opentelemetry-instrumentation-fastapi and opentelemetry-instrumentation-httpx; no code in app.py was changed.

  3. Click on the notifier: POST /notify span and inspect its attributes:

    • http.method, http.status_code, http.route — added by FastAPI auto-instrumentation

    • k8s.pod.name, k8s.deployment.name — added by the sidecar k8sattributes processor

    • service.name: notifier — set via the OTEL_SERVICE_NAME env var injected by the operator

  4. Notice that the traceparent context passed from backend to notifier is preserved correctly: the notifier root span appears as a child of the backend span, maintaining the single unified trace tree.
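
The traceparent header behind this has a fixed four-field format. A minimal parser, using the example trace ID from the W3C Trace Context specification (illustrative only; the SDKs handle this for you):

```python
def parse_traceparent(header: str) -> dict:
    """Parse a W3C traceparent header: version-traceid-spanid-flags.
    The 32-hex-char trace ID is what links spans across services (and appears
    in log lines via otelslog)."""
    version, trace_id, parent_span_id, flags = header.split("-")
    assert len(trace_id) == 32 and len(parent_span_id) == 16
    return {
        "version": version,
        "trace_id": trace_id,              # shared by every span in the trace
        "parent_span_id": parent_span_id,  # becomes the new span's parent
        "sampled": flags == "01",          # simplified; flags is really a bitmask
    }

example = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
print(parse_traceparent(example)["trace_id"])
# 4bf92f3577b34da6a3ce929d0e0e4736
```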

Compare: manual SDK vs auto-instrumentation

Characteristic Go (manual SDK) Python (auto-instrumentation)

Code change required

Yes (telemetry.Setup())

No

Image rebuild required

Yes

No

Activation mechanism

OTEL_ENABLED=true env var

Pod annotation

Span granularity

Full control (custom spans)

Framework-level (HTTP in/out)

Custom attributes

Full control — db.*, note.*, event.*, baggage

Limited without code changes

W3C Baggage

Yes — frontend injects, all services propagate

Yes — read from traceparent/baggage header automatically

Best for

Apps with source access

Apps without source access or rapid onboarding

Verify

  • ✓ Notifier pod has two running containers: notifier and otc-container

  • ✓ Notifier logs show OpenTelemetry SDK bootstrap messages

  • ✓ Traces for note creation show four service hops: frontend → backend → notifier → database

  • ✓ Notifier spans are children of the backend span (W3C Trace Context propagation works)

  • ✓ k8s.pod.name and k8s.deployment.name attributes are present on notifier spans

  • ✓ app.py was not modified at any point in this exercise

What you learned: The Instrumentation CR is a namespace-level template. A single annotation—instrumentation.opentelemetry.io/inject-python—causes the OpenTelemetry Operator’s admission webhook to inject an init-container that downloads and configures the Python agent at pod start. No source changes, no image rebuilds, no SDK imports. The W3C Trace Context standard ensures the notifier’s spans slot directly into the existing trace tree built by the Go services.

Learning outcomes

By completing this module, you should now understand:

  • ✓ A trace is the complete journey of a request; a span is one operation within that trace

  • ✓ Context propagation via W3C traceparent headers links spans across service boundaries automatically

  • ✓ Tempo stores traces on object storage with distinct distributor (write) and query-frontend (read) components

  • ✓ The sidecar-to-central collector pattern decouples application-facing collection from backend-facing export

  • ✓ A sidecar OpenTelemetryCollector in sidecar mode injects a container into annotated pods without any image changes

  • ✓ The k8sattributes and resourcedetection processors automatically enrich telemetry with Kubernetes and infrastructure metadata

  • ✓ TraceQL filters traces by service name, operation, duration, status, and any span attribute

  • ✓ The spanmetrics connector in the central collector generates RED metrics from traces, eliminating the need for a Prometheus client library

  • ✓ Auto-instrumentation via the Instrumentation CR enables zero-code telemetry for Python applications

  • ✓ A Python/FastAPI service can produce full traces—including W3C context propagation—with zero source-code changes

Business impact: You now have a complete three-signal observability pipeline for all four microservices. A single request to your application automatically produces:

  • A distributed trace showing the exact call path and per-service timing

  • RED metrics (rate, errors, duration) queryable in Prometheus—without any metrics code in the application

  • Structured log records correlated to the trace via trace ID—without any manual log attribute configuration

Module summary

You activated the full distributed tracing and OpenTelemetry pipeline for the workshop application—from infrastructure verification through to live trace visualization, TraceQL queries, and zero-code Python auto-instrumentation.

What you accomplished:

  • Verified the Tempo distributed tracing backend and understood its component architecture

  • Learned the two-tier sidecar-to-central OpenTelemetry Collector topology

  • Explored the OpenTelemetry SDK instrumentation already built into the Go services

  • Verified the OpenTelemetry Operator and the pre-deployed central collector in observability-demo

  • Created a sidecar OpenTelemetryCollector CR—carries all three signals (traces, metrics, logs) to the central collector over a single gRPC connection

  • Created an Instrumentation CR—used as the agent injection template for Python auto-instrumentation

  • Enabled the Go SDK and triggered sidecar injection on the three Go deployments

  • Analyzed live traces in Observe → Traces, identified span-level bottlenecks, and correlated spans with log records using TraceQL

  • Explored the central collector’s full routing: traces → Tempo, metrics + RED metrics → COO Prometheus remote write, logs → LokiStack application tenant

  • Enabled zero-code Python auto-instrumentation on the notifier service using a single pod annotation

  • Observed a four-hop trace (frontend → backend → notifier → database) with full context propagation across Go and Python services

Key concepts mastered:

  • Trace and span: A trace is a tree of spans; each span represents one service operation with timing and attributes

  • Context propagation: traceparent HTTP headers link spans across service boundaries using the W3C standard

  • TempoStack: Operator-managed, with distributor, ingester, querier, query-frontend, and compactor components

  • Sidecar mode: The operator injects the collector as a second container into annotated pods

  • k8sattributes and resourcedetection: Processors that attach Kubernetes context metadata to all telemetry

  • Two-tier architecture: Sidecar handles application-facing collection; central collector handles backend-facing export and routing

  • spanmetrics connector: Generates Prometheus-queryable RED metrics automatically from trace spans

  • Auto-instrumentation: Instrumentation CR + pod annotation enables agent injection for Python without code changes

  • Cross-language tracing: W3C Trace Context headers propagate correctly between Go (otelhttp) and Python (opentelemetry-instrumentation-httpx), forming a single unified trace tree

Continue to the conclusion to review key takeaways and next steps.