FastDeploy Tracing with OpenTelemetry
FastDeploy exports request tracing data through the OpenTelemetry Collector.
Tracing can be enabled when starting the server using the --trace-enable flag, and the OpenTelemetry Collector endpoint can be configured via --otlp-traces-endpoint.
Setup Guide
1. Install Dependencies
# Manual installation
pip install opentelemetry-sdk \
opentelemetry-api \
opentelemetry-exporter-otlp \
opentelemetry-exporter-otlp-proto-grpc
2. Start OpenTelemetry Collector and Jaeger
docker compose -f examples/observability/tracing/tracing_compose.yaml up -d
3. Start FastDeploy Server with Tracing Enabled
Configure FastDeploy Environment Variables
# Enable tracing
"TRACES_ENABLE": "true",
# Service name
"FD_SERVICE_NAME": "FastDeploy",
# Instance name
"FD_HOST_NAME": "trace_test",
# Exporter type
"TRACES_EXPORTER": "otlp",
# OTLP endpoint:
# gRPC: 4317
# HTTP: 4318
"EXPORTER_OTLP_ENDPOINT": "http://localhost:4317",
# Optional headers
"EXPORTER_OTLP_HEADERS": "Authentication=Txxxxx",
# Export protocol
"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL": "grpc",
Start FastDeploy
Start the FastDeploy server with the above configuration and ensure that tracing is enabled.
4. Send Requests and View Traces
- Open the Jaeger UI in your browser (port
16686) to visualize request traces. - The OpenTelemetry Collector will also export the trace data to a local file:
/tmp/otel_trace.json
Adding Tracing to Your Own Code
FastDeploy already inserts tracing points at most critical execution stages.
Developers can use the APIs provided in trace.py to add more fine-grained tracing.
4.1 Initialize Tracing
Each process involved in tracing must call:
process_tracing_init()
Each thread that participates in a traced request must call:
trace_set_thread_info("thread_label", tp_rank, dp_rank)
thread_label: identifier used for visual distinction of threads.tp_rank/dp_rank: optional values to label tensor parallelism or data parallelism ranks.
4.2 Mark Request Start and Finish
trace_req_start(rid, bootstrap_room, ts, role)
trace_req_finish(rid, ts, attrs)
- Creates both a Bootstrap Room Span and a Root Span.
- Supports inheritance from spans created by the FastAPI Instrumentor (context copying).
attrscan be used to attach additional attributes to the request span.
4.3 Add Tracing for Slices
Standard Slice
trace_slice_start("slice_name", rid)
trace_slice_end("slice_name", rid)
Mark Thread Completion
The last slice in a thread can mark the thread span as finished:
trace_slice_end("slice_name", rid, thread_finish_flag=True)
4.4 Trace Context Propagation Across Threads
Sender Side (ZMQ)
trace_context = trace_get_proc_propagate_context(rid)
req.trace_context = trace_context
Receiver Side (ZMQ)
trace_set_proc_propagate_context(rid, req.trace_context)
4.5 Add Events and Attributes
Events (recorded on the current slice)
trace_event("event_name", rid, ts, attrs)
Attributes (attached to the current slice)
trace_slice_add_attr(rid, attrs)
Extending the Tracing Framework
5.1 Trace Context Hierarchy
-
Two levels of Trace Context:
-
TraceReqContext– request-level context -
TraceThreadContext– thread-level context -
Three-level Span hierarchy:
-
req_root_span thread_spanslice_span
5.2 Available Span Name Enum (TraceSpanName)
FASTDEPLOY
PREPROCESS
SCHEDULE
PREFILL
DECODE
POSTPROCESS
- These enums can be used when creating slices to ensure consistent naming.
5.3 Important Notes
- Each thread span must be closed when the final slice of that thread finishes.
- Spans created by FastAPI Instrumentor are automatically inherited by the internal tracing context.