Module 3: Distributed tracing and OpenTelemetry
With metrics from Module 1 and centralized logging from Module 2, you can detect issues and understand what went wrong. However, a challenge remains: when your application experiences slow response times, metrics show overall slowness and logs show individual service behavior, but neither reveals where in the call chain the delay originates.
Your organization’s application—the frontend, backend, database, and notifier services—processes every user request as a chain of HTTP calls across multiple services. When performance degrades, you need to see the complete path of each request, including the time spent in each hop.
In this module, you will learn distributed tracing concepts, verify the Tempo tracing backend, understand how the Go application is instrumented with the OpenTelemetry SDK, and then activate the full telemetry pipeline end-to-end. By the end, you will observe live traces, metrics, and logs flowing from all four services through the OpenTelemetry Collector to Tempo, Prometheus, and Loki—and correlate them across all three signals in real time.
- Learning objectives
- Understanding distributed tracing
- Understanding the OpenTelemetry collector architecture
- Exercise 1: Verify Tempo deployment
- Exercise 2: Explore the application’s OpenTelemetry instrumentation
- Exercise 3: Verify the OpenTelemetry operator and pre-deployed components
- Exercise 4: Create the sidecar collector in your namespace
- Exercise 5: Create the Instrumentation CR in your namespace
- Exercise 6: Enable OpenTelemetry on the applications
- Exercise 7: View and explore live traces
- Exercise 8: Explore the central collector pipeline
- Exercise 9: Zero-code Python auto-instrumentation
- Learning outcomes
- Module summary
Learning objectives
By the end of this module, you’ll be able to:
- Understand distributed tracing concepts (spans, traces, context propagation)
- Understand how Tempo stores and serves trace data
- Understand the two-tier sidecar-to-central OpenTelemetry Collector architecture
- Create a sidecar `OpenTelemetryCollector` CR and configure its pipeline
- Create an `Instrumentation` CR for zero-code agent injection
- Enable OpenTelemetry on the workshop application
- Observe live traces in the Observe → Traces console UI and query them with TraceQL
- Understand how the central collector fans out telemetry to Tempo, Prometheus, and Loki
- Enable zero-code Python auto-instrumentation on the notifier service
- Observe a four-hop trace spanning Go and Python services in a single waterfall
Understanding distributed tracing
Before enabling tracing, you need to understand how distributed tracing works in microservice architectures.
Core tracing concepts
Trace: The complete journey of a single request through your system.
- Example: A user submits a note → frontend service → backend service → database service
- A trace captures the entire chain, start to finish, with timing and status at each step.
Span: One unit of work within a trace.
- Example: `backend: POST /api/notes` — one hop in the note-creation trace
- Contains: operation name, start time, duration, HTTP status code, and key-value attributes
Context propagation: The mechanism that ties spans together across service boundaries.
- Each service receives an incoming trace context via the `traceparent` HTTP header
- It creates a child span linked to the caller’s span, then propagates the context further downstream
- Uses the W3C Trace Context standard (`traceparent` and `tracestate` headers)
Parent-child relationships: Spans form a tree.
- Root span: `frontend: GET /new-note`
- Child span: `backend: POST /api/notes` (called by frontend)
- Grandchild span: `database: POST /api/events` (called by backend)
Why distributed tracing matters
Without tracing: You see symptoms, not causes.
- Metrics show: 95th-percentile API response time increased from 80ms to 950ms
- Logs show: all three services logged warnings at the same time
- Question: Which service is actually slow?
With tracing: You see the complete picture.
- Trace shows: frontend→backend took 20ms; backend→database took 880ms (bottleneck found)
- Other services operated normally
- Answer: The database service query needs optimization
Tempo architecture
Tempo is a distributed tracing backend optimized for storing and querying traces on cost-effective object storage.
Key components:
- Distributor: Receives trace data from instrumented applications over OTLP (gRPC port 4317, HTTP port 4318)
- Ingester: Buffers spans in memory and writes them to object storage
- Querier: Serves trace queries by reading from both the ingester cache and object storage in parallel
- Query Frontend: Load-balances query traffic across Querier pods
- Compactor: Merges and optimizes stored trace blocks over time
Storage: Tempo requires S3-compatible object storage. In this workshop the TempoStack uses in-cluster object storage provisioned by the OpenShift Data Foundation (ODF) NooBaa Multi-Cloud Gateway.
Integration: Tempo is queried by the Distributed Tracing UI plugin built into the OpenShift console via the Cluster Observability Operator. There is no separate Jaeger pod—the query interface is embedded directly in the console.
Understanding the OpenTelemetry collector architecture
Before enabling telemetry, understand how data flows from the application to each observability backend.
Two-tier collector pattern
This workshop uses a two-tier OpenTelemetry Collector topology. Every signal—traces, metrics, and logs—flows through the same two hops before reaching a purpose-built backend:
Application pods (%OPENSHIFT_USERNAME%-observability-demo)
+--------------------------------------------------------+
| frontend / backend / database / notifier |
| +----------+ OTLP HTTP (localhost:4318) |
| | app |---> otc-container (sidecar collector) |
| +----------+ |
+--------------------------------------------------------+
| All signals: traces + metrics + logs
| OTLP gRPC (cluster DNS, port 4317)
v
central-collector-collector.observability-demo.svc:4317
+------------------------------------------------------------------+
| Central collector – deployment mode, 2 replicas |
| (namespace: observability-demo) |
| |
| Signal routing: |
| traces --> otlp/tempo --> TempoStack distributor :4317 |
| traces --> spanmetrics --> traces/spanmetrics pipeline |
| metrics --> prometheusremotewrite |
| --> COO Prometheus /api/v1/write :9090 |
| logs --> otlphttp/logs |
| --> LokiStack gateway :8080 (application tenant) |
+------------------------------------------------------------------+
Why two tiers?
- Sidecar collector: Runs within the application pod as a second container (`otc-container`). The app sends to `localhost:4318` (no network hop). The sidecar enriches every span, metric data point, and log record with Kubernetes metadata (`k8s.pod.name`, `k8s.deployment.name`, `k8s.namespace.name`) via the `k8sattributes` processor, then forwards all three signals over a single gRPC connection to the central collector.
- Central collector: Runs as a shared `Deployment` (2 replicas) in `observability-demo`. It receives all signals from every user’s sidecar and routes them to different backends using different protocols and auth mechanisms. It also runs the `spanmetrics` connector, which generates RED metrics directly from incoming trace spans.
This pattern keeps the sidecar simple (no secrets, no TLS config, no auth tokens) while centralizing the complex backend integrations in one place.
Processors in the sidecar collector
The sidecar uses four processors in order:
| Processor | Function |
|---|---|
| `memory_limiter` | Prevents the collector from consuming more than 75% of available pod memory |
| `resourcedetection` | Detects OpenShift infrastructure attributes (such as `k8s.cluster.name` and `cloud.platform`) |
| `k8sattributes` | Calls the Kubernetes API to attach pod name, namespace, deployment name, node name, and pod UID to every span, metric, and log record |
| `batch` | Accumulates records before sending to reduce network overhead |
Span metrics connector in the central collector
The central collector uses a spanmetrics connector to generate RED metrics automatically from incoming traces:
- Rate: `traces_spanmetrics_calls_total` — request count per service and operation
- Errors: `traces_spanmetrics_calls_total{status.code="STATUS_CODE_ERROR"}` — error count
- Duration: `traces_spanmetrics_latency_bucket` — latency histogram with configurable buckets
These metrics are published to the COO-managed Prometheus instance via prometheusremotewrite. The COO MonitoringStack has enableRemoteWriteReceiver: true set, which activates the /api/v1/write endpoint that Prometheus exposes for ingest.
Exercise 1: Verify Tempo deployment
The Tempo distributed tracing stack was deployed via GitOps as part of the workshop infrastructure. Verify that all components are running and ready.
Steps
1. Verify the Tempo Operator is running:

   ```
   oc get pods -n openshift-tempo-operator
   ```

   Expected output:

   ```
   NAME                                      READY   STATUS    RESTARTS   AGE
   tempo-operator-controller-manager-xxxxx   2/2     Running   0          1h
   ```

2. Check the TempoStack instance:

   ```
   oc get tempostack -n openshift-tempo-operator
   ```

   Expected output:

   ```
   NAME    AGE   CONDITION
   tempo   1h    Ready
   ```

   The `CONDITION` must be Ready before traces can be ingested.

3. Verify each Tempo component is running:

   ```
   oc get pods -n openshift-tempo-operator -l app.kubernetes.io/instance=tempo
   ```

   Expected output:

   ```
   NAME                               READY   STATUS    RESTARTS   AGE
   tempo-tempo-compactor-xxxxx        1/1     Running   0          1h
   tempo-tempo-distributor-xxxxx      1/1     Running   0          1h
   tempo-tempo-ingester-0             1/1     Running   0          1h
   tempo-tempo-querier-xxxxx          1/1     Running   0          1h
   tempo-tempo-query-frontend-xxxxx   1/1     Running   0          1h
   ```

4. Confirm the distributor service endpoint (used by the OpenTelemetry Collector to write traces):

   ```
   oc get svc tempo-tempo-distributor -n openshift-tempo-operator
   ```

   Expected output:

   ```
   NAME                      TYPE        CLUSTER-IP   PORT(S)
   tempo-tempo-distributor   ClusterIP   172.30.x.x   4317/TCP, 4318/TCP
   ```

   Port 4317 is OTLP gRPC and port 4318 is OTLP HTTP. The central OpenTelemetry Collector forwards traces to this endpoint.
Verify
Check that your tracing infrastructure is operational:
- ✓ Tempo Operator pod is Running (2/2 containers)
- ✓ TempoStack instance condition is Ready
- ✓ All Tempo component pods (distributor, ingester, querier, query-frontend, compactor) are Running
- ✓ Tempo distributor service exposes ports 4317 and 4318
What you learned: The TempoStack operator deploys and manages all Tempo components. The distributor is the write endpoint; the query-frontend is the read endpoint used by the OpenShift console UI plugin.
Exercise 2: Explore the application’s OpenTelemetry instrumentation
The three Go services—frontend, backend, and database—are already instrumented with the OpenTelemetry SDK. Telemetry generation is gated by an environment variable so it can be enabled without rebuilding the container image.
Steps
1. Inspect the shared telemetry package:

   The repository contains a shared `telemetry` package used by all three services at `src/telemetry/telemetry.go`.

   Open the Source Code tab in the workshop application (the running frontend) and navigate to `telemetry/telemetry.go` to browse the file with syntax highlighting.

   Key points in this package:

   - `Enabled()` function: Returns `true` only when `OTEL_ENABLED=true` is set in the environment. All SDK initialization is skipped when false, so the application behaves identically to an uninstrumented binary.
   - `Setup()` function: Initializes three OTLP HTTP exporters when enabled—trace, metric, and log—all targeting `OTEL_EXPORTER_OTLP_ENDPOINT`. Once telemetry is enabled in Exercise 6, this will point to `http://localhost:4318` (the injected sidecar collector).

2. Inspect the backend service instrumentation:

   Open the Source Code tab in the workshop application and select `backend/main.go` to browse the file directly.

   You will see three instrumentation layers activated when `OTEL_ENABLED=true`:

   | Layer | Purpose |
   |---|---|
   | `telemetry.Setup()` | Creates the global TraceProvider, MeterProvider, and LoggerProvider from the OTLP exporters |
   | `otelslog.NewHandler()` | Bridges the Go standard `slog` logger to the OTel log exporter—every structured log line is emitted as an OTel log record carrying the active trace ID |
   | `otelhttp.NewTransport()` | Wraps the outbound HTTP client so the W3C `traceparent` header is injected into every downstream call |

3. Understand the server-side instrumentation:

   The inbound HTTP handler is wrapped with `otelhttp.NewHandler()`, which:

   - Creates a server span for every inbound request
   - Extracts the incoming `traceparent` header and registers this span as a child of the calling service’s span
   - Automatically records HTTP attributes (`http.method`, `http.route`, `http.status_code`) on the span

4. Understand context propagation across services:

   The trace context flows automatically through your microservices:

   ```
   Browser
     +-- frontend (otelhttp server span: GET /new-note)
         |  injects traceparent into outbound request
         +-- backend (otelhttp server span: POST /api/notes)
             |  injects traceparent into outbound request
             +-- database (otelhttp server span: POST /api/events)
   ```

   Because every service uses `otelhttp` for both inbound (handler) and outbound (transport) HTTP calls, the trace ID and parent span ID are automatically threaded through the entire call chain with no manual span creation required in business logic code.

5. Review the `enable-otel.yaml` patch file:

   Open the Source Code tab in the workshop application and select `enable-otel.yaml` to browse the full patch file, or view it in the terminal.

   This file contains three `Deployment` patches—one per service. When applied, each patch:

   - Sets `OTEL_ENABLED=true` to activate the SDK
   - Sets `OTEL_SERVICE_NAME` to the service name (becomes the `service.name` resource attribute)
   - Sets `OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318` to send telemetry to the injected sidecar
   - Adds the pod annotation `sidecar.opentelemetry.io/inject: "sidecar"` to trigger sidecar injection
   - Sets `serviceAccountName: otel-collector-sidecar` for RBAC access to the Kubernetes API

6. Verify the current state of the deployments:

   ```
   oc get deployment frontend backend \
     -n %OPENSHIFT_USERNAME%-observability-demo \
     -o custom-columns='NAME:.metadata.name,CONTAINERS:.spec.template.spec.containers[*].name'

   oc get statefulset database \
     -n %OPENSHIFT_USERNAME%-observability-demo \
     -o custom-columns='NAME:.metadata.name,CONTAINERS:.spec.template.spec.containers[*].name'
   ```

   Expected output:

   ```
   NAME       CONTAINERS
   frontend   frontend
   backend    backend

   NAME       CONTAINERS
   database   database
   ```

   Each service currently has a single application container. After enabling OpenTelemetry later in this module, each pod will gain a second container—the injected sidecar collector.
Verify
- ✓ `src/telemetry/telemetry.go` gates all SDK initialization on `OTEL_ENABLED=true`
- ✓ `otelhttp.NewHandler()` creates server spans and extracts incoming trace context
- ✓ `otelhttp.NewTransport()` injects the `traceparent` header into all outbound HTTP calls
- ✓ `otelslog.NewHandler()` bridges Go structured logs to OTel log records
- ✓ Current deployments have one container each (no OTel yet)
What you learned: Effective OpenTelemetry Go instrumentation uses three layers—trace provider, otelhttp middleware, and log bridge—to emit traces, metrics, and correlated logs from a single SDK setup call. The otelhttp transport ensures W3C trace context propagation across all service boundaries automatically, without any manual span creation in application code.
Exercise 3: Verify the OpenTelemetry operator and pre-deployed components
Before creating resources in your namespace, confirm the OpenTelemetry Operator and the shared observability-demo infrastructure are healthy.
Steps
1. Verify the OpenTelemetry Operator pod is running:

   ```
   oc get pods -n openshift-operators \
     -l app.kubernetes.io/name=opentelemetry-operator
   ```

   Expected output:

   ```
   NAME                                      READY   STATUS    RESTARTS   AGE
   opentelemetry-operator-controller-xxxxx   2/2     Running   0          1h
   ```

2. Confirm the operator registered the CRDs:

   ```
   oc api-resources | grep opentelemetry
   ```

   Expected output:

   ```
   instrumentations          opentelemetry.io   true   Instrumentation
   opentelemetrycollectors   opentelemetry.io   true   OpenTelemetryCollector
   ```

3. Inspect the pre-deployed central collector:

   ```
   oc get opentelemetrycollector central-collector -n observability-demo
   ```

   Expected output:

   ```
   NAME                MODE         VERSION
   central-collector   deployment   0.140.0-2
   ```

4. Verify the central collector pods are running:

   ```
   oc get pods -n observability-demo \
     -l app.kubernetes.io/name=central-collector-collector
   ```

   Expected output:

   ```
   NAME                                READY   STATUS    RESTARTS   AGE
   central-collector-collector-xxxxx   1/1     Running   0          1h
   central-collector-collector-xxxxx   1/1     Running   0          1h
   ```

   Two replicas provide resilience for the shared collection endpoint.

5. Inspect the central collector service (this is where sidecars forward telemetry):

   ```
   oc get svc central-collector-collector -n observability-demo
   ```

   Expected output:

   ```
   NAME                          TYPE        CLUSTER-IP   PORT(S)
   central-collector-collector   ClusterIP   172.30.x.x   4317/TCP, 4318/TCP
   ```
Verify
- ✓ OpenTelemetry Operator pod is Running (2/2)
- ✓ `OpenTelemetryCollector` and `Instrumentation` CRDs are registered
- ✓ `central-collector` exists in `observability-demo` in deployment mode with 2 replicas
- ✓ `central-collector-collector` service exposes ports 4317 and 4318
What you learned: The central collector is pre-deployed in the shared observability-demo namespace by GitOps. Your task is to create the per-namespace sidecar collector and Instrumentation CR in your own namespace, then wire the application into that pipeline.
Exercise 4: Create the sidecar collector in your namespace
You will deploy a sidecar-mode OpenTelemetryCollector CR in your %OPENSHIFT_USERNAME%-observability-demo namespace. When this CR exists, the OpenTelemetry Operator automatically injects a sidecar container into any pod in the namespace that carries the annotation sidecar.opentelemetry.io/inject: "sidecar".
Steps
1. Verify the required ServiceAccount is present in your namespace:

   ```
   oc get serviceaccount otel-collector-sidecar \
     -n %OPENSHIFT_USERNAME%-observability-demo
   ```

   This ServiceAccount was pre-created for your namespace with the RBAC permissions needed by the `k8sattributes` and `resourcedetection` processors (read access to pods, namespaces, and nodes).

   Expected output:

   ```
   NAME                     SECRETS   AGE
   otel-collector-sidecar   0         1h
   ```

2. Create the sidecar OpenTelemetryCollector CR:

   ```
   cat <<EOF | oc apply -f -
   apiVersion: opentelemetry.io/v1beta1
   kind: OpenTelemetryCollector
   metadata:
     name: sidecar
     namespace: %OPENSHIFT_USERNAME%-observability-demo
   spec:
     mode: sidecar
     serviceAccount: otel-collector-sidecar
     config:
       receivers:
         otlp:
           protocols:
             grpc:
               endpoint: 0.0.0.0:4317
             http:
               endpoint: 0.0.0.0:4318
       processors:
         memory_limiter:
           check_interval: 1s
           limit_percentage: 75
           spike_limit_percentage: 15
         resourcedetection:
           detectors: [openshift]
           timeout: 2s
         k8sattributes:
           auth_type: serviceAccount
           passthrough: false
           extract:
             metadata:
               - k8s.namespace.name
               - k8s.deployment.name
               - k8s.node.name
               - k8s.pod.name
               - k8s.pod.uid
         batch:
           timeout: 10s
           send_batch_size: 1024
       exporters:
         otlp/grpc:
           endpoint: central-collector-collector.observability-demo.svc:4317
           tls:
             insecure: true
           sending_queue:
             enabled: true
             queue_size: 5000
           retry_on_failure:
             enabled: true
             initial_interval: 5s
             max_interval: 30s
             max_elapsed_time: 10m
       service:
         pipelines:
           traces:
             receivers: [otlp]
             processors: [memory_limiter, resourcedetection, k8sattributes, batch]
             exporters: [otlp/grpc]
           metrics:
             receivers: [otlp]
             processors: [memory_limiter, resourcedetection, k8sattributes, batch]
             exporters: [otlp/grpc]
           logs:
             receivers: [otlp]
             processors: [memory_limiter, resourcedetection, k8sattributes, batch]
             exporters: [otlp/grpc]
   EOF
   ```

   Note that collector component IDs use the `type/name` form: `otlp/grpc` is an instance of the `otlp` exporter type (a bare `otlp_grpc` would not match any registered exporter type).

3. Verify the CR was accepted:

   ```
   oc get opentelemetrycollector sidecar -n %OPENSHIFT_USERNAME%-observability-demo
   ```

   Expected output:

   ```
   NAME      MODE      VERSION
   sidecar   sidecar   0.140.0-2
   ```

   In `sidecar` mode, the operator does not create a standalone `Deployment`. Instead it stores the container spec and injects it into pods on demand when the annotation is detected.
Understand the pipeline
The sidecar carries all three signal types over the same processor chain:
otlp receiver (localhost:4317 gRPC / localhost:4318 HTTP)
|
memory_limiter -- drop if pod memory > 75%
|
resourcedetection -- add k8s.cluster.name, cloud.platform
|
k8sattributes -- add k8s.pod.name, k8s.deployment.name,
k8s.namespace.name, k8s.pod.uid
|
batch -- buffer and flush (max 10s / 1024 records)
|
otlp/grpc exporter -> central-collector-collector.observability-demo.svc:4317
(gRPC, insecure + retry queue for resiliency)
The sidecar does not route signals to different destinations—that responsibility belongs to the central collector. All signals arrive at the central collector on a single gRPC connection.
Verify
- ✓ `otel-collector-sidecar` ServiceAccount exists in `%OPENSHIFT_USERNAME%-observability-demo`
- ✓ `sidecar` OpenTelemetryCollector CR exists in `sidecar` mode
- ✓ CR configuration includes all three pipelines (traces, metrics, logs)
- ✓ Exporter is `otlp/grpc` with queue/retry enabled and endpoint `central-collector-collector.observability-demo.svc:4317`
Exercise 5: Create the Instrumentation CR in your namespace
The Instrumentation CR is a template that tells the OpenTelemetry Operator how to configure the auto-instrumentation agent init-container when injecting into pods. You need one per namespace. In this exercise you create it for %OPENSHIFT_USERNAME%-observability-demo—it will be used in Exercise 9 when you enable Python auto-instrumentation on the notifier service.
Steps
1. Create the Instrumentation CR:

   ```
   cat <<EOF | oc apply -f -
   apiVersion: opentelemetry.io/v1alpha1
   kind: Instrumentation
   metadata:
     name: my-instrumentation
     namespace: %OPENSHIFT_USERNAME%-observability-demo
   spec:
     exporter:
       endpoint: http://localhost:4318
     sampler:
       type: parentbased_traceidratio
       argument: "1.0"
     propagators:
       - tracecontext
       - baggage
     python:
       env:
         - name: OTEL_EXPORTER_OTLP_PROTOCOL
           value: http/protobuf
   EOF
   ```

   Key fields explained:

   - `spec.exporter.endpoint: http://localhost:4318` — the auto-instrumented process sends telemetry to the sidecar on localhost (both containers live in the same pod)
   - `spec.sampler.type: parentbased_traceidratio` with `argument: "1.0"` — 100% sampling, suitable for a workshop
   - `spec.propagators: [tracecontext, baggage]` — W3C Trace Context headers so the incoming `traceparent` from the Go backend is read and spans are linked into the existing trace
   - `spec.python.env` with `OTEL_EXPORTER_OTLP_PROTOCOL: http/protobuf` — forces the Python SDK to use HTTP/protobuf rather than gRPC, matching the sidecar receiver on port 4318

2. Verify the CR was accepted:

   ```
   oc get instrumentation my-instrumentation -n %OPENSHIFT_USERNAME%-observability-demo
   ```

   Expected output:

   ```
   NAME                 AGE
   my-instrumentation   10s
   ```
Verify
- ✓ `my-instrumentation` Instrumentation CR exists in `%OPENSHIFT_USERNAME%-observability-demo`
- ✓ `spec.exporter.endpoint` is `http://localhost:4318`
What you learned: The Instrumentation CR is a per-namespace configuration template. Unlike the sidecar CR (which provides a container spec merged into application pods), the Instrumentation CR is referenced by pods via the instrumentation.opentelemetry.io/inject-<language> annotation — the Operator reads it and injects a language-specific agent init-container plus the necessary environment variables with no application changes required.
Exercise 6: Enable OpenTelemetry on the applications
Now you’ll activate the OpenTelemetry SDK in all three Go services. This adds the sidecar injection annotation, the SDK environment variables, and the correct ServiceAccount to each deployment without rebuilding container images.
Steps
1. Set your namespace as a shell variable:

   ```
   NAMESPACE="%OPENSHIFT_USERNAME%-observability-demo"
   ```

2. Add the OTEL environment variables to all three services:

   ```
   for app in frontend backend; do
     oc set env deployment/${app} -n ${NAMESPACE} \
       OTEL_ENABLED=true \
       OTEL_SERVICE_NAME=${app} \
       OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
       OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
   done

   oc set env statefulset/database -n ${NAMESPACE} \
     OTEL_ENABLED=true \
     OTEL_SERVICE_NAME=database \
     OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
     OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
   ```

   Expected output:

   ```
   deployment.apps/frontend updated
   deployment.apps/backend updated
   statefulset.apps/database updated
   ```

3. Add the sidecar injection annotation to each pod template:

   ```
   for app in frontend backend; do
     oc patch deployment/${app} -n ${NAMESPACE} \
       --type=strategic \
       -p='{"spec":{"template":{"metadata":{"annotations":{"sidecar.opentelemetry.io/inject":"sidecar"}}}}}'
   done

   oc patch statefulset/database -n ${NAMESPACE} \
     --type=strategic \
     -p='{"spec":{"template":{"metadata":{"annotations":{"sidecar.opentelemetry.io/inject":"sidecar"}}}}}'
   ```

   The annotation value `sidecar` must match the name of the `OpenTelemetryCollector` CR you created in Exercise 4. When the operator sees this annotation on a pod being created, it injects the collector container spec from that CR.

   Expected output:

   ```
   deployment.apps/frontend patched
   deployment.apps/backend patched
   statefulset.apps/database patched
   ```

4. Set the ServiceAccount on all three services so the injected sidecar can call the Kubernetes API:

   ```
   for app in frontend backend; do
     oc set serviceaccount deployment/${app} \
       otel-collector-sidecar \
       -n ${NAMESPACE}
   done

   oc set serviceaccount statefulset/database \
     otel-collector-sidecar \
     -n ${NAMESPACE}
   ```

   Expected output:

   ```
   deployment.apps/frontend serviceaccount updated
   deployment.apps/backend serviceaccount updated
   statefulset.apps/database serviceaccount updated
   ```

5. Monitor the rolling restart:

   `database` is a StatefulSet with a `ReadWriteOnce` PVC. Its default `RollingUpdate` strategy terminates the existing pod before starting the replacement, so the volume is cleanly released. Expect a few seconds of database unavailability during this step.

   ```
   oc rollout status deployment/frontend deployment/backend \
     -n ${NAMESPACE}

   oc rollout status statefulset/database \
     -n ${NAMESPACE}
   ```

   Expected output:

   ```
   deployment "frontend" successfully rolled out
   deployment "backend" successfully rolled out
   statefulset rolling to 1 pods at revision database-xxxxx
   waiting for statefulset rolling update to complete 0 pods at revision database-xxxxx...
   statefulset rolling update complete 1 pods at revision database-xxxxx...
   ```

6. Verify sidecar injection occurred:

   ```
   oc get pods -n ${NAMESPACE} \
     -o custom-columns='NAME:.metadata.name,CONTAINERS:.spec.containers[*].name'
   ```

   Expected output:

   ```
   NAME             CONTAINERS
   frontend-xxxxx   frontend, otc-container
   backend-xxxxx    backend, otc-container
   database-0       database, otc-container
   ```

   The `otc-container` is the injected OpenTelemetry Collector sidecar. Each pod now has two containers: the application and its dedicated collector.

7. Confirm the sidecar container is running and pipelines have started:

   ```
   oc logs -n ${NAMESPACE} \
     -l app=frontend -c otc-container | tail -10
   ```

   Expected output (excerpt):

   ```
   Everything is ready. Begin running and processing data.
   Pipeline started (traces).
   Pipeline started (metrics).
   Pipeline started (logs).
   ```

8. Generate application traffic to produce telemetry:

   ```
   FRONTEND_URL=$(oc get route frontend \
     -n ${NAMESPACE} \
     -o jsonpath='{.spec.host}')

   for i in $(seq 1 30); do
     curl -sk -o /dev/null "https://$FRONTEND_URL/"
     curl -sk -o /dev/null -X POST "https://$FRONTEND_URL/api/notes" \
       -H "Content-Type: application/json" \
       -d "{\"title\":\"Test $i\",\"content\":\"OTel test\"}"
     sleep 1
   done
   echo "Traffic generation complete"
   ```

   This sends 30 iterations of mixed GET and POST requests, generating spans at each hop through frontend → backend → database.
Verify
- ✓ OTEL environment variables set on all three workloads
- ✓ Sidecar injection annotation present on all three pod templates
- ✓ All three workloads rolled out successfully
- ✓ Each pod has two containers: the application and `otc-container` (sidecar)
- ✓ Sidecar logs show all three pipelines started
- ✓ Traffic generated against the frontend route
What you learned: The OpenTelemetry Operator’s sidecar injection mechanism transforms a running pod by adding a pre-configured collector container, without rebuilding the image. The annotation sidecar.opentelemetry.io/inject: "sidecar" tells the operator to use the sidecar CR you created in Exercise 4 as the container specification.
Exercise 7: View and explore live traces
With traffic flowing and telemetry active, you can now view real traces from your application in the OpenShift console and use TraceQL to query them.
Steps
1. Navigate to Observe → Traces in the OpenShift console.

2. Set the search parameters:

   - Namespace: `%OPENSHIFT_USERNAME%-observability-demo`
   - Time range: Last 5 minutes

   Click Search or the refresh icon.

   The scatter plot should now show individual points, one per trace, with the y-axis showing duration in milliseconds.
3. Review the scatter plot and trace list:

   Each point in the scatter plot represents a single trace:

   - X-axis: trace start time
   - Y-axis: total trace duration (ms)
   - Bubble size: number of spans in the trace

   Clusters of points at the top of the chart indicate slow traces. The trace list below shows:

   - Trace name: root span operation (for example, `frontend: GET /`)
   - Spans: total number of spans in the trace
   - Duration: end-to-end time
   - Start time: when the root span began

   Click one of the higher points to open the trace detail view.

4. Explore the trace waterfall:

   The waterfall shows a horizontal bar for each span, indented to reflect parent-child relationships. The `otelhttp` library creates two spans per service boundary: a server span named after the matched route (`POST /api/notes`) and a client span named after the HTTP method (`HTTP POST`) for each outbound call:

   ```
   frontend: POST /api/notes  [=====================================] 134ms
     frontend: HTTP POST       [===================================] 130ms
       backend: POST /api/notes   [========================] 100ms
         backend: HTTP POST           [================] 56ms
           database: POST /notes         [===============] 55ms
         backend: HTTP POST                              [========] 25ms
   ```

   Bar length represents duration. The server span for each service is the parent of that service’s outgoing client spans.
-
Click a span to expand its attributes:
The attributes on any span come from three distinct layers—the application itself, the Go OTel SDK semantic conventions, and the sidecar collector processors.
Frontend span (
frontend: POST /api/notes)Attribute Value / description http.methodGEThttp.status_code200http.routeMatched route pattern (for example,
/)baggage.client.platformweb— set by the frontend baggage middleware and propagated to every downstream span via the W3Cbaggageheaderbaggage.request.sourceworkshop-demo— identifies the request origin; visible on all three service spans in this traceBackend span (
backend: POST /api/notes)Attribute Value / description http.methodPOSThttp.status_code200db.systemchainsql— identifies the downstream data storedb.operationINSERT— derived from the HTTP method (POST → INSERT,GET → SELECT)db.sql.tablenotes— extracted from the request pathpeer.servicedatabase— the logical name of the service callednet.peer.namedatabase— DNS hostname of the database servicedb.response.status_codeHTTP status returned by the database service
baggage.client.platformweb— forwarded from the W3CbaggageHTTP headerbaggage.request.sourceworkshop-demo— forwarded from the W3CbaggageHTTP headerDatabase span (
database: POST /notes)Attribute Value / description db.systemchainsqldb.operationINSERTdb.sql.tablenotesnote.idUnique ID assigned to the created note record
note.titleTitle string from the request body
note.content.lengthCharacter length of the note content
event.idID of the audit event recorded alongside the note
event.sourceService name that triggered the event
event.http_statusHTTP status of the original request that created the event
Sidecar-added attributes (present on every span regardless of service):
-
Kubernetes attributes (
k8sattributesprocessor):k8s.pod.name,k8s.deployment.name,k8s.namespace.name -
Resource attributes (
resourcedetectionprocessor):cloud.platform,k8s.cluster.name
-
-
Observe W3C Baggage propagation:

Baggage is a key-value store carried inside the `baggage` HTTP header alongside `traceparent`. The frontend sets two members before every outbound request:

```
baggage: client.platform=web,request.source=workshop-demo
```

Each downstream service reads this header and records every baggage member as a span attribute prefixed with `baggage.`. This means the same logical context—where the request came from—is searchable as a span attribute on every service in the chain.

All TraceQL queries below include `{ .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" }` to scope results to your namespace. The Tempo backend is shared across all workshop users, so this filter is required to see only your traces.

Filter all traces that originated from the web frontend:

```
{ .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["baggage.client.platform"] = "web" }
```

Or find every span produced by the workshop demo requests, regardless of service:

```
{ .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["baggage.request.source"] = "workshop-demo" }
```

-
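The baggage mechanism above can be sketched in plain Go. This stdlib-only, illustrative parser turns a `baggage` header value into `baggage.`-prefixed span-attribute pairs; the workshop services actually use the OpenTelemetry propagator API rather than hand parsing, and `baggageToAttributes` is an invented name for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// baggageToAttributes parses a W3C baggage header value such as
// "client.platform=web,request.source=workshop-demo" into span
// attributes, prefixing each key with "baggage." the way the
// workshop services record them.
func baggageToAttributes(header string) map[string]string {
	attrs := map[string]string{}
	for _, member := range strings.Split(header, ",") {
		member = strings.TrimSpace(member)
		// A member may carry properties after ';' — keep only key=value.
		if i := strings.Index(member, ";"); i >= 0 {
			member = member[:i]
		}
		key, value, ok := strings.Cut(member, "=")
		if !ok || key == "" {
			continue // skip malformed members
		}
		attrs["baggage."+strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return attrs
}

func main() {
	attrs := baggageToAttributes("client.platform=web,request.source=workshop-demo")
	fmt.Println(attrs["baggage.client.platform"]) // web
	fmt.Println(attrs["baggage.request.source"])  // workshop-demo
}
```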
Use TraceQL to find slow database spans:

Click Show query beneath the filter bar to reveal the TraceQL editor. TraceQL is the Tempo query language, similar to PromQL for Prometheus or LogQL for Loki.

```
{ .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && resource.service.name = "database" && duration > 50ms }
```

This filters to only traces in your namespace where the database service had at least one span exceeding 50ms. Because random delays of up to 60ms are injected in the backend and database services, you should see several results.

Query by the specific table the slow operation touched:

```
{ .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["db.sql.table"] = "notes" && duration > 30ms }
```

Or filter by database operation type:

```
{ .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["db.operation"] = "INSERT" && resource.service.name = "database" }
```

-
Use TraceQL to surface business-level data:

The custom `note.*` and `event.*` attributes open the trace store as a queryable record of application events. Find all traces that created a note:

```
{ .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["note.id"] != "" }
```

Or find traces by peer service to understand the backend-to-database communication pattern:

```
{ .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["peer.service"] = "database" && span["db.sql.table"] = "notes" }
```

-
Find error traces:

```
{ .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && status = error }
```

If any requests returned HTTP errors, this will surface the traces where something went wrong, with all spans visible for root cause analysis.
-
Correlate a trace with logs:

Note the Trace ID shown at the top of a trace detail view (a 32-character hex string).

Navigate to Observe → Logs.

Query for log lines containing that trace ID:

```
{kubernetes_namespace_name="%OPENSHIFT_USERNAME%-observability-demo"} |= "<your-trace-id>"
```

Because the application uses `otelslog.NewHandler()`, every structured log line emitted during a traced request carries the active trace ID as a log field. This lets you move directly from a slow span to the exact log lines emitted during that span.
Verify
-
✓ Traces appear in Observe → Traces for namespace `%OPENSHIFT_USERNAME%-observability-demo`

-
✓ Trace waterfall shows spans from all three services (frontend, backend, database)
-
✓ Frontend spans carry `baggage.client.platform=web` and `baggage.request.source=workshop-demo`

-
✓ Backend spans carry `db.system`, `db.operation`, `db.sql.table`, `peer.service`, `db.response.status_code`

-
✓ Database spans carry `note.id`, `note.title`, `note.content.length` on note-creation requests

-
✓ All spans include Kubernetes metadata (`k8s.pod.name`, `k8s.deployment.name`)

-
✓ TraceQL query `{ .k8s.namespace.name = "%OPENSHIFT_USERNAME%-observability-demo" && span["baggage.request.source"] = "workshop-demo" }` returns results

-
✓ A trace ID from the trace view can be found in Observe → Logs
What you learned: The workshop application uses two complementary instrumentation strategies. The sidecar collector processors automatically enrich spans with Kubernetes metadata—no application code required. The Go OTel SDK adds semantic-convention attributes (`db.*`, `peer.*`, `net.*`) for production-grade observability, business-level attributes (`note.*`, `event.*`) for application-specific queries, and W3C Baggage to propagate contextual metadata across all service boundaries. TraceQL lets you query across infrastructure, application, and business dimensions using a single query language.
Exercise 8: Explore the central collector pipeline
The central collector in observability-demo receives telemetry from all sidecar collectors and routes it to three backends. Inspect the configuration and verify each export path.
Steps
-
View the central collector configuration:
```shell
oc get opentelemetrycollector central-collector \
  -n observability-demo \
  -o jsonpath='{.spec.config}' | yq .
```

Note the four export destinations:
-

`otlp/tempo` → `tempo-tempo-distributor.openshift-tempo-operator.svc:4317` — traces, TLS + bearer token auth + `X-Scope-OrgID: dev` header

-

`prometheusremotewrite` → COO Prometheus `/api/v1/write` — application metrics pushed from pods

-

`prometheusremotewrite` (via `traces/spanmetrics` pipeline) → same endpoint — span-derived RED metrics

-

`otlphttp/logs` → LokiStack gateway at `openshift-logging` — logs, OTLP/HTTP JSON + bearer token + service CA TLS
-
-
Inspect the spanmetrics connector:

```shell
oc get opentelemetrycollector central-collector \
  -n observability-demo \
  -o jsonpath='{.spec.config}' | yq '.connectors.spanmetrics'
```

The `spanmetrics` connector acts as both an exporter (receiving spans from the traces pipeline) and a receiver (producing metric records for the `traces/spanmetrics` pipeline). It generates latency histograms and call-count counters for every `service.name` + `span.name` combination automatically.
Query the generated span metrics in Prometheus:

Navigate to Observe → Metrics and enter:

```
sum(rate(traces_span_metrics_calls_total{service_name="frontend"}[5m])) by (span_name)
```

This shows the request rate per operation for the frontend service—derived solely from traces, with no Prometheus client library code in the application.

To see the p95 latency for backend operations:

```
histogram_quantile(0.95,
  sum(rate(traces_span_metrics_duration_milliseconds_bucket{k8s_namespace_name="%OPENSHIFT_USERNAME%-observability-demo", k8s_deployment_name="backend"}[5m])) by (span_name, le)
)
```
Understand the logs pipeline:

The central collector runs a dedicated logs pipeline that maps OTEL resource attributes to the label keys expected by the LokiStack `openshift-logging` multi-tenancy gateway:

```yaml
processors:
  resource/logs:  # (1)
    attributes:
      - {key: kubernetes.namespace_name, from_attribute: k8s.namespace.name, action: upsert}
      - {key: kubernetes.pod_name, from_attribute: k8s.pod.name, action: upsert}
      - {key: kubernetes.container_name, from_attribute: k8s.container.name, action: upsert}
      - {key: log_type, value: application, action: upsert}
  transform/logs:  # (2)
    log_statements:
      - context: log
        statements:
          - set(attributes["level"], ConvertCase(severity_text, "lower"))
exporters:
  otlphttp/logs:  # (3)
    endpoint: https://logging-loki-gateway-http.openshift-logging.svc.cluster.local:8080/api/logs/v1/application/otlp
    encoding: json
    tls:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
    auth:
      authenticator: bearertokenauth  # (4)
```

1. LokiStack in `openshift-logging` tenancy mode requires `kubernetes.namespace_name` (not `k8s.namespace.name`) for tenant routing.
2. Derives a `level` field from OTEL `severity_text` (lowercased) for log filtering in the Loki UI.
3. Sends log records as OTLP/HTTP JSON to the LokiStack gateway's application tenant endpoint.
4. Uses the pod's service account token for bearer auth—the central collector SA has `loki.grafana.com/application` create rights via ClusterRoleBinding.

Verify a log record arrived in Loki by navigating to Observe → Logs in the OpenShift console and querying:

```
{kubernetes_namespace_name="%OPENSHIFT_USERNAME%-observability-demo"} | json
```

You should see structured log records from `frontend`, `backend`, and `database`, each containing `traceID` and `spanID` fields that link them to the traces you viewed in Exercise 7.
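To see what `histogram_quantile()` computes in the p95 query above, here is a stdlib-only Go sketch: locate the cumulative bucket containing the target rank, then interpolate linearly inside it. The `quantile` function is an invented, simplified model for illustration; the real PromQL implementation additionally handles rate-adjusted counts, NaNs, and edge cases.

```go
package main

import "fmt"

// quantile mimics the core of Prometheus histogram_quantile(): given
// cumulative bucket counts and their upper bounds ("le"), find the
// bucket the target rank falls into and interpolate linearly within it.
func quantile(q float64, les []float64, cumulative []float64) float64 {
	total := cumulative[len(cumulative)-1] // count in the +Inf bucket
	rank := q * total
	lower, prevCount := 0.0, 0.0
	for i, le := range les {
		if cumulative[i] >= rank {
			// Linear interpolation between the bucket's bounds.
			width := le - lower
			inBucket := cumulative[i] - prevCount
			return lower + width*(rank-prevCount)/inBucket
		}
		lower = le
		prevCount = cumulative[i]
	}
	return les[len(les)-1] // rank fell into +Inf: clamp to last finite bound
}

func main() {
	// Upper bounds in ms and cumulative counts; the last count is +Inf.
	les := []float64{10, 50, 100}
	counts := []float64{20, 80, 95, 100} // 100 observations total
	fmt.Printf("%.1f\n", quantile(0.95, les, counts))
}
```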
Verify
-
✓ Central collector configuration shows `otlp/tempo`, `prometheusremotewrite`, and `otlphttp/logs` exporters

-

✓ `spanmetrics` connector configuration is visible

-

✓ `traces_span_metrics_calls_total` metric exists in Prometheus for your services

-

✓ `traces_span_metrics_duration_milliseconds_bucket` histogram is queryable for p95 latency

-

✓ Observe → Logs shows structured log records from `%OPENSHIFT_USERNAME%-observability-demo` with `traceID` fields
What you learned: The central collector is a fanout hub—one OTLP receiver, four exporters. Each signal type is independently processed and routed: traces go to Tempo (with TLS and multi-tenancy headers), metrics are remote-written to COO Prometheus, span-derived RED metrics follow the same path via the spanmetrics connector, and logs are attribute-remapped and forwarded to the LokiStack application tenant using service-account bearer auth.
Exercise 9: Zero-code Python auto-instrumentation
The workshop application includes a fourth service: notifier, a Python/FastAPI microservice. The backend calls notifier after every note create, update, or delete operation. The notifier records the event in the database service.
Open src/notifier/app.py and notice what is absent: there are no OpenTelemetry imports of any kind. The file contains only FastAPI route handlers and an httpx call. Yet by the end of this exercise, full traces—including spans for every notifier HTTP call—will appear in the trace waterfall.
Open the Source Code tab in the workshop application and select notifier/app.py to compare it with the Go services.
This demonstrates the difference between the manual SDK approach used by the Go services and the zero-code auto-instrumentation provided by the OpenTelemetry Operator.
| Service | Language | Instrumentation method |
|---|---|---|
| `frontend` | Go | Manual SDK (`otelhttp`) |
| `backend` | Go | Manual SDK (`otelhttp`) |
| `database` | Go | Manual SDK (`otelhttp`) |
| `notifier` | Python | Zero-code: OTel Operator init-container injection |
Step 1: Observe the missing notifier span
Before enabling auto-instrumentation, generate traffic and look at a trace waterfall.
-
Generate several note-creation requests:
```shell
FRONTEND_URL=$(oc get route frontend \
  -n %OPENSHIFT_USERNAME%-observability-demo \
  -o jsonpath='{.spec.host}')

for i in $(seq 1 10); do
  curl -sk -o /dev/null -X POST "https://$FRONTEND_URL/api/notes" \
    -H "Content-Type: application/json" \
    -d "{\"title\":\"Auto-instrumentation test $i\",\"content\":\"Exercise 9\"}"
  sleep 1
done
```
Open Observe → Traces in the OpenShift console, set namespace to
`%OPENSHIFT_USERNAME%-observability-demo`, and click a recent trace.

The waterfall will show three hops: `frontend → backend → database`. The backend called notifier, but no notifier span appears because the Python process emits no telemetry without an agent.
Step 2: Annotate the notifier deployment
A single `oc patch` command adds the two annotations that trigger both sidecar injection (same as for the Go services) and Python agent injection, and sets the sidecar's service account.
-
Patch the notifier deployment:
```shell
oc patch deployment notifier \
  -n %OPENSHIFT_USERNAME%-observability-demo \
  --type=json \
  -p='[
    {
      "op": "add",
      "path": "/spec/template/metadata/annotations",
      "value": {
        "sidecar.opentelemetry.io/inject": "sidecar",
        "instrumentation.opentelemetry.io/inject-python": "my-instrumentation"
      }
    },
    {
      "op": "add",
      "path": "/spec/template/spec/serviceAccountName",
      "value": "otel-collector-sidecar"
    }
  ]'
```

The annotation value `my-instrumentation` references the Instrumentation CR you created in Exercise 5 in the same namespace.

When this annotated pod is scheduled, the OpenTelemetry Operator's mutating admission webhook injects an init container that downloads `opentelemetry-distro` and `opentelemetry-instrumentation-fastapi` into a shared volume. The Python process picks them up via `PYTHONPATH` and a configurator hook—no code changes or image rebuilds required.
Watch the rollout:
```shell
oc rollout status deployment/notifier -n %OPENSHIFT_USERNAME%-observability-demo
```
Confirm the pod now has two containers (application + sidecar):
```shell
oc get pods -n %OPENSHIFT_USERNAME%-observability-demo \
  -l app=notifier \
  -o custom-columns='NAME:.metadata.name,CONTAINERS:.spec.containers[*].name'
```

Expected output:

```
NAME             CONTAINERS
notifier-xxxxx   notifier, otc-container
```
Verify the Python SDK is active by checking notifier logs:
```shell
oc logs -n %OPENSHIFT_USERNAME%-observability-demo \
  -l app=notifier -c notifier | head -20
```

Look for OpenTelemetry bootstrap messages such as `Instrumenting FastAPI` or `OpenTelemetry SDK configured`.
Step 3: Generate traffic and observe the four-hop trace
-
Send another batch of note-creation requests:
```shell
FRONTEND_URL=$(oc get route frontend \
  -n %OPENSHIFT_USERNAME%-observability-demo \
  -o jsonpath='{.spec.host}')

for i in $(seq 1 15); do
  curl -sk -o /dev/null -X POST "https://$FRONTEND_URL/api/notes" \
    -H "Content-Type: application/json" \
    -d "{\"title\":\"Traced note $i\",\"content\":\"With notifier\"}"
  sleep 1
done
echo "Done"
```
Return to Observe → Traces in the console. A note-creation trace will now show a fourth hop. The `backend: HTTP POST` client span that previously had no child now has a notifier server span beneath it:

```
frontend: POST /api/notes          [=====================================] 134ms
  frontend: HTTP POST              [===================================] 130ms
    backend: POST /api/notes       [========================] 100ms
      backend: HTTP POST           [================] 56ms
        database: POST /notes      [===============] 55ms
      backend: HTTP POST           [========] 25ms
        notifier: POST /notify     [======] 20ms
          notifier: HTTP POST      [===] 10ms
```

The `notifier` spans are produced entirely by `opentelemetry-instrumentation-fastapi` and `opentelemetry-instrumentation-httpx`—no code was changed in `app.py`.
Click on the `notifier: POST /notify` span and inspect its attributes:

- `http.method`, `http.status_code`, `http.route` — added by FastAPI auto-instrumentation

- `k8s.pod.name`, `k8s.deployment.name` — added by the sidecar `k8sattributes` processor

- `service.name: notifier` — set via the `OTEL_SERVICE_NAME` env var injected by the operator
-
-
Notice that the `traceparent` context passed from backend to notifier is preserved correctly: the notifier's server span appears as a child of the backend's client span, maintaining a single unified trace tree.
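The propagation that keeps these spans in one tree rests on the W3C `traceparent` header, whose format is `version-traceid-spanid-flags`. As a stdlib-only sketch (the real work is done by OTel propagators in both Go and Python), this invented `parseTraceparent` helper splits the header so a downstream service can keep the trace ID and record the caller's span ID as its parent:

```go
package main

import (
	"fmt"
	"strings"
)

// parseTraceparent splits a W3C traceparent header such as
// "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
// into its trace ID and parent span ID. A downstream service keeps
// the trace ID (same trace) and records the span ID as its parent,
// which is what slots the notifier's spans under the backend's.
func parseTraceparent(header string) (traceID, parentSpanID string, ok bool) {
	parts := strings.Split(header, "-")
	// Field widths: version(2)-trace-id(32)-parent-id(16)-flags(2).
	if len(parts) != 4 || len(parts[1]) != 32 || len(parts[2]) != 16 {
		return "", "", false
	}
	return parts[1], parts[2], true
}

func main() {
	tid, psid, ok := parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	fmt.Println(ok)   // true
	fmt.Println(tid)  // 4bf92f3577b34da6a3ce929d0e0e4736
	fmt.Println(psid) // 00f067aa0ba902b7
}
```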
Compare: manual SDK vs auto-instrumentation
| Characteristic | Go (manual SDK) | Python (auto-instrumentation) |
|---|---|---|
| Code change required | Yes (`otelhttp` + SDK setup) | No |
| Image rebuild required | Yes | No |
| Activation mechanism | SDK initialization in application code | Pod annotation |
| Span granularity | Full control (custom spans) | Framework-level (HTTP in/out) |
| Custom attributes | Full control — `note.*`, `event.*` attributes | Limited without code changes |
| W3C Baggage | Yes — frontend injects, all services propagate | Yes — read from the incoming `baggage` header |
| Best for | Apps with source access | Apps without source access or rapid onboarding |
Verify
-
✓ Notifier pod has two running containers: `notifier` and `otc-container`

-

✓ Notifier logs show OpenTelemetry SDK bootstrap messages

-

✓ Traces for note creation show four service hops: `frontend → backend → notifier → database`

-

✓ Notifier spans are children of the backend span (W3C Trace Context propagation works)

-

✓ `k8s.pod.name` and `k8s.deployment.name` attributes are present on notifier spans

-

✓ `app.py` was not modified at any point in this exercise
What you learned: The Instrumentation CR is a namespace-level template. A single annotation—`instrumentation.opentelemetry.io/inject-python`—causes the OpenTelemetry Operator's admission webhook to inject an init container that downloads and configures the Python agent at pod start. No source changes, no image rebuilds, no SDK imports. The W3C Trace Context standard ensures the notifier's spans slot directly into the existing trace tree built by the Go services.
Learning outcomes
By completing this module, you should now understand:
-
✓ A trace is the complete journey of a request; a span is one operation within that trace
-
✓ Context propagation via W3C `traceparent` headers links spans across service boundaries automatically
✓ Tempo stores traces on object storage with distinct distributor (write) and query-frontend (read) components
-
✓ The sidecar-to-central collector pattern decouples application-facing collection from backend-facing export
-
✓ An `OpenTelemetryCollector` in `sidecar` mode injects a collector container into annotated pods without any image changes
✓ The `k8sattributes` and `resourcedetection` processors automatically enrich telemetry with Kubernetes and infrastructure metadata
✓ TraceQL filters traces by service name, operation, duration, status, and any span attribute
-
✓ The `spanmetrics` connector in the central collector generates RED metrics from traces, eliminating the need for a Prometheus client library
✓ Auto-instrumentation via the `Instrumentation` CR enables zero-code telemetry for Python applications
✓ A Python/FastAPI service can produce full traces—including W3C context propagation—with zero source-code changes
Business impact: You now have a complete three-signal observability pipeline for all four microservices. A single request to your application automatically produces:
-
A distributed trace showing the exact call path and per-service timing
-
RED metrics (rate, errors, duration) queryable in Prometheus—without any metrics code in the application
-
Structured log records correlated to the trace via trace ID—without any manual log attribute configuration
Module summary
You activated the full distributed tracing and OpenTelemetry pipeline for the workshop application—from infrastructure verification through to live trace visualization, TraceQL queries, and zero-code Python auto-instrumentation.
What you accomplished:
-
Verified the Tempo distributed tracing backend and understood its component architecture
-
Learned the two-tier sidecar-to-central OpenTelemetry Collector topology
-
Explored the OpenTelemetry SDK instrumentation already built into the Go services
-
Verified the OpenTelemetry Operator and the pre-deployed central collector in `observability-demo`

-
Created a sidecar `OpenTelemetryCollector` CR—carries all three signals (traces, metrics, logs) to the central collector over a single gRPC connection
Created an `Instrumentation` CR—used as the agent injection template for Python auto-instrumentation
Enabled the Go SDK and triggered sidecar injection on the three Go deployments
-
Analyzed live traces in Observe → Traces using TraceQL, identified span-level bottlenecks, and correlated spans with log records via shared trace IDs
-
Explored the central collector’s full routing: traces → Tempo, metrics + RED metrics → COO Prometheus remote write, logs → LokiStack application tenant
-
Enabled zero-code Python auto-instrumentation on the notifier service using a single pod annotation
-
Observed a four-hop trace (`frontend → backend → notifier → database`) with full context propagation across Go and Python services
Key concepts mastered:
-
Trace and span: A trace is a tree of spans; each span represents one service operation with timing and attributes
-
Context propagation: `traceparent` HTTP headers link spans across service boundaries using the W3C standard
TempoStack: Operator-managed, with distributor, ingester, querier, query-frontend, and compactor components
-
Sidecar mode: The operator injects the collector as a second container into annotated pods
-
k8sattributes and resourcedetection: Processors that attach Kubernetes context metadata to all telemetry
-
Two-tier architecture: Sidecar handles application-facing collection; central collector handles backend-facing export and routing
-
spanmetrics connector: Generates Prometheus-queryable RED metrics automatically from trace spans
-
Auto-instrumentation: `Instrumentation` CR + pod annotation enables agent injection for Python without code changes
Cross-language tracing: W3C Trace Context headers propagate correctly between Go (`otelhttp`) and Python (`opentelemetry-instrumentation-httpx`), forming a single unified trace tree
Continue to the conclusion to review key takeaways and next steps.