OpenTelemetry

All em-runtime services export traces, metrics, and logs via OpenTelemetry using the vendor-neutral OTLP wire protocol. Instrumentation is built into the platform, so no application code changes are needed.

Architecture

The em-runtime Helm chart does not include an OTLP collector, so deploy your preferred collector separately. Any OTLP-compatible backend works: Grafana Cloud, Datadog, Splunk, New Relic, Honeycomb, or a self-hosted stack.

OpenTelemetry vs Langfuse

These two observability systems are complementary, not overlapping:

| | OpenTelemetry | Langfuse |
| --- | --- | --- |
| Instruments | HTTP requests, DB queries, Redis, service health | LLM API calls (model, prompt, tokens, cost, quality) |
| Answers | "Is the service healthy? Is it slow?" | "Is the AI producing good output? What does it cost?" |
| Integration | OTLP exporter (enabled by default) | LiteLLM callbacks (when LANGFUSE_HOST is set) |
| Storage | PromQL-compatible metrics backend / Tempo / Loki | Langfuse’s own PostgreSQL + ClickHouse |
| Visualization | Grafana | Langfuse UI |

Use both together: OTel for infrastructure health, Langfuse for LLM quality and cost. See the Langfuse setup guide for LLM-specific observability.

Configuration

Telemetry is controlled via environment variables in each service’s env block in the Helm values:

| Variable | Default | Description |
| --- | --- | --- |
| OTEL_ENABLED | "true" | Master switch for all telemetry. |
| OTEL_EXPORTER_OTLP_ENDPOINT | "http://otel-collector:4317" | OTLP collector gRPC endpoint. |
| OTEL_TRACES_ENABLED | "true" | Enable distributed tracing. |
| OTEL_METRICS_ENABLED | "true" | Enable metrics export. |
| OTEL_LOGS_ENABLED | "true" | Enable log record export via OTLP. |
| OTEL_TRACE_SAMPLE_RATE | "1.0" | Trace sampling ratio from 0.0 to 1.0. The default of 1.0 samples every trace; lower it via Helm values in production. |
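To make the sampling ratio concrete, the sketch below shows how a trace-ID ratio sampler can turn a value such as OTEL_TRACE_SAMPLE_RATE into a keep-or-drop decision. It follows the general approach of OpenTelemetry's ratio-based sampler; whether em-runtime uses exactly this sampler is an assumption, and `should_sample` is a hypothetical helper name.

```python
# Illustrative sketch only: how a ratio-based sampler can map a sample rate
# to a keep/drop decision. The helper name is hypothetical and is not part
# of em-runtime.

TRACE_ID_LOW_BITS = (1 << 64) - 1  # mask for the low 64 bits of a 128-bit trace ID


def should_sample(trace_id: int, rate: float) -> bool:
    """Keep a trace when its low 64 bits fall below rate * 2**64."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    bound = round(rate * (1 << 64))
    return (trace_id & TRACE_ID_LOW_BITS) < bound


if __name__ == "__main__":
    trace_id = int("4bf92f3577b34da6a3ce929d0e0e4736", 16)
    for rate in (1.0, 0.5, 0.0):
        print(rate, should_sample(trace_id, rate))
```

Because the decision depends only on the trace ID, every service that sees the same trace reaches the same decision, so sampled traces stay complete across service boundaries.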

Point to Your Collector

Override the endpoint in your values file:
em-runtime-governance:
  env:
    OTEL_EXPORTER_OTLP_ENDPOINT: "http://otel-collector.monitoring:4317"

em-runtime-assets:
  env:
    OTEL_EXPORTER_OTLP_ENDPOINT: "http://otel-collector.monitoring:4317"

em-runtime-utils:
  env:
    OTEL_EXPORTER_OTLP_ENDPOINT: "http://otel-collector.monitoring:4317"

Disable Telemetry

For test environments without a collector:
em-runtime-governance:
  env:
    OTEL_ENABLED: "false"

Individual signals can also be toggled independently by setting OTEL_TRACES_ENABLED, OTEL_METRICS_ENABLED, or OTEL_LOGS_ENABLED to "false".

Telemetry initialization is non-fatal. If the collector is unreachable, services log a warning and continue operating normally.
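The behavior described above can be sketched roughly as follows. This is not em-runtime's actual code: `TelemetrySettings`, `load_settings`, `init_telemetry`, and the `start_exporters` callback are hypothetical names, and the sketch assumes the master switch takes precedence over the per-signal toggles.

```python
# Illustrative sketch only: reading the telemetry variables and starting
# telemetry without letting a failure stop the service. All names below
# are hypothetical.
import logging
import os
from dataclasses import dataclass
from typing import Callable, Mapping

logger = logging.getLogger("telemetry")


def _flag(env: Mapping[str, str], name: str, default: str = "true") -> bool:
    """Interpret an environment value such as "true" or "false"."""
    return env.get(name, default).strip().lower() == "true"


@dataclass(frozen=True)
class TelemetrySettings:
    endpoint: str
    traces: bool
    metrics: bool
    logs: bool


def load_settings(env: Mapping[str, str] = os.environ) -> TelemetrySettings:
    """Apply the master switch first, then the per-signal toggles."""
    enabled = _flag(env, "OTEL_ENABLED")
    return TelemetrySettings(
        endpoint=env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://otel-collector:4317"),
        traces=enabled and _flag(env, "OTEL_TRACES_ENABLED"),
        metrics=enabled and _flag(env, "OTEL_METRICS_ENABLED"),
        logs=enabled and _flag(env, "OTEL_LOGS_ENABLED"),
    )


def init_telemetry(
    settings: TelemetrySettings,
    start_exporters: Callable[[TelemetrySettings], None],
) -> bool:
    """Return True if telemetry started; warn and return False otherwise."""
    if not (settings.traces or settings.metrics or settings.logs):
        return False  # nothing to start
    try:
        start_exporters(settings)
    except Exception as exc:  # non-fatal: the service keeps running
        logger.warning("Telemetry unavailable, continuing without it: %s", exc)
        return False
    return True
```

A service would call `init_telemetry(load_settings(), start_exporters)` once at startup and carry on regardless of the return value.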

Auto-Instrumented Libraries

The following libraries are automatically instrumented with no code changes:

| Library | What It Captures |
| --- | --- |
| FastAPI | Inbound HTTP request spans (excludes /health) |
| SQLAlchemy | Database query spans and connection metrics |
| Redis | Redis command spans |
| httpx | Outbound HTTP request spans (inter-service SDK calls) |

Telemetry Signals

Traces

Distributed traces follow the W3C Trace Context format. Traces propagate across service boundaries automatically via httpx instrumentation. Key trace fields:
  • service.name — identifies the emitting service
  • http.method, http.url, http.status_code — HTTP span attributes
  • db.system, db.statement — database query details
  • trace_id — correlates logs and traces
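For readers who have not seen W3C Trace Context on the wire, the sketch below parses the `traceparent` header that carries the trace ID between services. It is a standalone illustration using only the standard library, not the parser the instrumentation uses; `parse_traceparent` and `TraceParent` are hypothetical names, and the sketch handles only the four-field layout defined for version 00.

```python
# Illustrative sketch only: parsing a W3C Trace Context `traceparent` header
# of the form version-traceid-parentid-flags (all lowercase hex).
import re
from typing import NamedTuple, Optional

_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str   # 16 bytes as 32 hex characters
    parent_id: str  # 8 bytes as 16 hex characters
    sampled: bool   # lowest bit of the trace-flags field


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is malformed or invalid."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The spec forbids version "ff" and all-zero trace or parent IDs.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


if __name__ == "__main__":
    print(parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"))
```

The `trace_id` extracted here is the same value that appears in the trace_id field of log records, which is what makes log-trace correlation possible.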

Metrics

Application metrics are exported via OTLP and include:
  • HTTP request duration histograms
  • HTTP request counts by status code
  • Database connection pool utilization
  • Redis command latency

Logs

Structured JSON logs are written to stdout and optionally exported via OTLP:

| Field | Description |
| --- | --- |
| level | Log level (DEBUG, INFO, WARNING, ERROR) |
| message | Log message |
| timestamp | ISO 8601 timestamp |
| service | Service name |
| trace_id | W3C trace ID for log-trace correlation |
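A log line with these fields could be produced by a formatter along the following lines. This is a minimal sketch using the standard `logging` module, not the formatter em-runtime ships; `JsonFormatter` is a hypothetical name, and passing the trace ID through `extra` is an assumption made to keep the example self-contained.

```python
# Illustrative sketch only: emitting JSON log lines with the fields listed
# above. The class name and the way trace_id is supplied are hypothetical.
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "service": self.service,
            # None when the record was emitted outside of a traced request.
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)


if __name__ == "__main__":
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter(service="em-runtime-assets"))
    log = logging.getLogger("demo")
    log.addHandler(handler)
    log.setLevel(logging.INFO)
    log.info(
        "asset stored",
        extra={"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736"},
    )
```

Searching the log backend for a trace_id taken from a slow trace then returns every log line emitted while that request was being handled.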

OTel Collector Configuration

Deploy the OpenTelemetry Collector to receive, process, and export telemetry data.
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 5s
    send_batch_size: 1024

exporters:
  # Traces
  otlp/tempo:
    endpoint: tempo.monitoring:4317
    tls:
      insecure: true

  # Metrics
  prometheusremotewrite:
    endpoint: http://prometheus.monitoring:9090/api/v1/write

  # Logs (recent contrib releases deprecate the loki exporter; Loki 3.x also accepts OTLP directly)
  loki:
    endpoint: http://loki.monitoring:3100/loki/api/v1/push

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/tempo]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheusremotewrite]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [loki]
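Each pipeline in the configuration refers to receivers, processors, and exporters by name, and a name that is not declared at the top level causes the collector to reject the configuration at startup. The sketch below checks for that mistake before deploying. It assumes the YAML has already been loaded into a Python dictionary, and `undefined_components` is a hypothetical helper, not a collector tool.

```python
# Illustrative sketch only: verify that every component referenced by a
# pipeline is declared at the top level of the collector configuration.
from typing import Any, Dict, List


def undefined_components(config: Dict[str, Any]) -> List[str]:
    """Return one message per pipeline reference that has no declaration."""
    problems: List[str] = []
    pipelines = config.get("service", {}).get("pipelines", {})
    for pipeline, spec in pipelines.items():
        for kind in ("receivers", "processors", "exporters"):
            declared = config.get(kind, {})
            for name in spec.get(kind, []):
                if name not in declared:
                    problems.append(
                        f"{pipeline}: {kind[:-1]} '{name}' is not defined"
                    )
    return problems


if __name__ == "__main__":
    # Trimmed mirror of the traces pipeline, with one deliberate mistake.
    config = {
        "receivers": {"otlp": {}},
        "processors": {"batch": {}},
        "exporters": {"otlp/tempo": {}},
        "service": {
            "pipelines": {
                "traces": {
                    "receivers": ["otlp"],
                    "processors": ["batch"],
                    "exporters": ["otlp/tempo", "debug"],
                }
            }
        },
    }
    for problem in undefined_components(config):
        print(problem)
```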

Helm Installation

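The chart reads the collector configuration from the config key of its values file, so nest the collector configuration under config: in the file passed with -f. The chart also requires mode to be set, and the prometheusremotewrite and loki exporters are only available in the contrib image. Setting fullnameOverride keeps the Service name as otel-collector, which matches the endpoint configured for the services.

helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm install otel-collector open-telemetry/opentelemetry-collector \
  --namespace monitoring \
  --create-namespace \
  --set mode=deployment \
  --set image.repository=otel/opentelemetry-collector-contrib \
  --set fullnameOverride=otel-collector \
  -f otel-collector-config.yaml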

Metric Sources

| Source | Endpoint | Protocol |
| --- | --- | --- |
| Application services (OTel) | OTLP gRPC (4317) / HTTP (4318) | OpenTelemetry |
| Keycloak | /keycloak/metrics (port 8080) | Prometheus exposition format |
| Kubernetes | kube-state-metrics, node-exporter | Prometheus exposition format |
| PostgreSQL | pg_exporter (optional) | Prometheus exposition format |
| Redis | redis_exporter (optional) | Prometheus exposition format |

Backend Options

| Component | Purpose |
| --- | --- |
| Prometheus | Metrics storage and querying |
| Grafana Tempo | Distributed trace storage |
| Loki | Log aggregation |
| Grafana | Dashboards and visualization |
| Alertmanager | Alert routing and notification |

Deploy via the Grafana LGTM Helm chart or individual component charts.

Next Steps

  • Helm Configuration: Telemetry environment variables in Helm values.
  • Prerequisites: Infrastructure requirements for the observability stack.