Introduction
OpenTelemetry is an open-source observability framework that provides mechanisms for creating and sending traces, metrics, and logs. It consists of various elements such as protocols for transmission and SDKs for different programming languages. In this article, we will explore how OpenTelemetry achieves distributed tracing.
What is Distributed Tracing?
Distributed tracing is a technique for tracking a single request as it passes through multiple services, such as the microservices behind one API call. It helps to visualize and understand the flow of that request and to see where time is spent or where errors occur.
Key Components of Distributed Tracing
- Trace: A collection of spans representing a single request or transaction.
- Span: A single unit of work within a trace, representing a specific operation.
A trace is a tree structure composed of multiple spans: the root span has no parent, and every other span points to its parent.
Understanding Trace from Span
To achieve distributed tracing, it is essential to understand the relationship between traces and spans. Each span includes the following elements:
- TraceId: The ID of the trace to which the span belongs.
- SpanId: A unique ID for the span within the trace.
- ParentSpanId: The ID of the parent span.
These fields are defined in the Span message of the OpenTelemetry Protocol (OTLP), whose schema is written in Protocol Buffers.
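To make the three identifiers concrete, here is a minimal sketch that uses only the Go standard library. The `Span` struct and the `NewRootSpan` and `NewChildSpan` helpers are hypothetical names for illustration, not the OpenTelemetry SDK types. The sketch assumes the usual sizes: a 16-byte trace ID and an 8-byte span ID, both hex-encoded.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// Span is a simplified, hypothetical stand-in for a real span.
// It keeps only the three identifiers discussed above.
type Span struct {
	Name         string
	TraceID      string // 16 random bytes, hex-encoded (32 characters)
	SpanID       string // 8 random bytes, hex-encoded (16 characters)
	ParentSpanID string // empty for a root span
}

// randomHex returns n random bytes encoded as a hex string.
func randomHex(n int) string {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		panic(err) // no sensible recovery if the system RNG fails
	}
	return hex.EncodeToString(b)
}

// NewRootSpan starts a new trace: fresh TraceID, no parent.
func NewRootSpan(name string) Span {
	return Span{Name: name, TraceID: randomHex(16), SpanID: randomHex(8)}
}

// NewChildSpan inherits the parent's TraceID and records the parent's SpanID.
func NewChildSpan(parent Span, name string) Span {
	return Span{
		Name:         name,
		TraceID:      parent.TraceID,
		SpanID:       randomHex(8),
		ParentSpanID: parent.SpanID,
	}
}

func main() {
	parent := NewRootSpan("parent-span")
	child := NewChildSpan(parent, "child-span")
	fmt.Println("same trace:", child.TraceID == parent.TraceID)
	fmt.Println("child points at parent:", child.ParentSpanID == parent.SpanID)
}
```

The only thing a child needs from its parent is the pair (TraceID, SpanID), which is why that pair is what gets passed between services.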
Example of Span Elements in a Trace
Consider the following Go code example:
package main

import (
	"context"

	"go.opentelemetry.io/otel"
)

// CreateTrace starts a parent span and a child span in the same trace.
func CreateTrace() {
	tracer := otel.Tracer("example-tracer")

	// A span started from an empty context becomes the root of a new trace.
	ctx, parentSpan := tracer.Start(context.Background(), "parent-span")
	defer parentSpan.End()

	// Passing ctx links the new span to parentSpan.
	_, childSpan := tracer.Start(ctx, "child-span")
	defer childSpan.End()
}
If a tracer provider with a stdout exporter is registered, the exported spans look something like this (abridged to the relevant fields):
{
"Name": "child-span",
"SpanContext": {
"TraceID": "9023c11c3272a955da5f499faa9afa71",
"SpanID": "ca44f59e13b40d44"
},
"Parent": {
"TraceID": "9023c11c3272a955da5f499faa9afa71",
"SpanID": "70e471ef5735034d"
}
}
{
"Name": "parent-span",
"SpanContext": {
"TraceID": "9023c11c3272a955da5f499faa9afa71",
"SpanID": "70e471ef5735034d"
},
"Parent": {
"TraceID": "00000000000000000000000000000000",
"SpanID": "0000000000000000"
}
}
In this example, the TraceId is the same for both spans, indicating they belong to the same trace. The ParentSpanId of the child span matches the SpanId of the parent span, establishing a parent-child relationship.
Propagation of Trace Context
To enable distributed tracing across multiple services, the trace context needs to be propagated. This is achieved by passing the TraceId and SpanId through headers in HTTP requests.
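The idea can be sketched by hand with the standard library alone. In the example below, `inject`, `handler`, and `roundTrip` are hypothetical helpers, and the request is served in memory with `httptest` rather than over the network. The header name and value layout follow the W3C Trace Context format covered in the next section. In real code an OpenTelemetry propagator does this work for you.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// headerName is the W3C header that carries the trace context.
const headerName = "traceparent"

// inject copies the caller's trace context into the outgoing request.
// Version "00" and flags "01" (sampled) are hard-coded for this sketch.
func inject(req *http.Request, traceID, spanID string) {
	req.Header.Set(headerName, fmt.Sprintf("00-%s-%s-01", traceID, spanID))
}

// handler plays the downstream service: it reads the header and
// echoes it back so we can see what arrived.
func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, r.Header.Get(headerName))
}

// roundTrip sends one in-memory request through handler and returns the body.
func roundTrip(traceID, spanID string) string {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	inject(req, traceID, spanID)
	rec := httptest.NewRecorder()
	handler(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(roundTrip("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7"))
}
```

A real downstream service would not echo the value. It would use the received trace ID for its own spans and record the received span ID as their parent.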
W3C Trace Context
The W3C Trace Context specification standardizes how trace context information is passed. The traceparent header is used in HTTP requests with the format: ${version}-${trace-id}-${parent-id}-${trace-flags}.
Example using curl:
curl -H "traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01" localhost
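To show what a receiver does with that header, here is a minimal parsing sketch using only the standard library. `TraceParent` and `ParseTraceParent` are hypothetical names, and the parser is deliberately simplified: it accepts only version 00 and treats the lowest bit of trace-flags as the sampled flag. Production code should rely on the OpenTelemetry propagator instead.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// TraceParent holds the four fields of a version-00 traceparent header.
type TraceParent struct {
	Version  string
	TraceID  string
	ParentID string
	Sampled  bool // lowest bit of trace-flags
}

// isLowerHex reports whether s is lowercase hex of exactly n characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n || s != strings.ToLower(s) {
		return false
	}
	_, err := hex.DecodeString(s)
	return err == nil
}

// ParseTraceParent is a simplified parser for the version-00 layout only.
func ParseTraceParent(h string) (TraceParent, error) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 {
		return TraceParent{}, errors.New("traceparent: expected 4 dash-separated fields")
	}
	version, traceID, parentID, flags := parts[0], parts[1], parts[2], parts[3]
	if version != "00" {
		return TraceParent{}, errors.New("traceparent: unsupported version")
	}
	if !isLowerHex(traceID, 32) || !isLowerHex(parentID, 16) || !isLowerHex(flags, 2) {
		return TraceParent{}, errors.New("traceparent: malformed field")
	}
	// All-zero IDs are reserved to mean "invalid".
	if traceID == strings.Repeat("0", 32) || parentID == strings.Repeat("0", 16) {
		return TraceParent{}, errors.New("traceparent: all-zero ID")
	}
	flagByte, _ := hex.DecodeString(flags) // already validated above
	return TraceParent{
		Version:  version,
		TraceID:  traceID,
		ParentID: parentID,
		Sampled:  flagByte[0]&0x01 == 1,
	}, nil
}

func main() {
	tp, err := ParseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	fmt.Println(tp.TraceID, tp.ParentID, tp.Sampled, err)
}
```

For the curl example above, the parsed trace ID is 4bf92f3577b34da6a3ce929d0e0e4736, the parent ID is 00f067aa0ba902b7, and the sampled flag is set.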
Propagation in Go
Here's an example of a server and client in Go that demonstrates trace context propagation:
Server Code
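package main

import (
	"fmt"
	"log"
	"net/http"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/stdout/stdouttrace"
	"go.opentelemetry.io/otel/propagation"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func RunServer() {
	// Export finished spans to stdout as JSON.
	exp, err := stdouttrace.New()
	if err != nil {
		log.Fatal(err)
	}
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exp),
	)

	// Register the W3C propagator so otelhttp reads the incoming traceparent header.
	// Without it the default propagator is a no-op and the server starts a new trace.
	otel.SetTextMapPropagator(propagation.TraceContext{})

	otelHandler := otelhttp.NewHandler(http.HandlerFunc(handler), "handle-request", otelhttp.WithTracerProvider(tp))
	http.Handle("/", otelHandler)
	log.Fatal(http.ListenAndServe(":9002", nil))
}

func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Println("handled")
}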
Client Code
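package main

import (
	"context"
	"io"
	"log"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"go.opentelemetry.io/otel"
)

// CreatePropagationTrace assumes a global TracerProvider and the W3C
// propagator were already registered with otel.SetTracerProvider and
// otel.SetTextMapPropagator; otherwise no traceparent header is sent.
func CreatePropagationTrace() {
	tracer := otel.Tracer("example-tracer")
	ctx, span := tracer.Start(context.Background(), "hello-span")
	defer span.End()

	// otelhttp.Get creates the "HTTP GET" client span and injects the trace context.
	resp, err := otelhttp.Get(ctx, "http://localhost:9002")
	if err != nil {
		log.Printf("request failed: %v", err)
		return
	}
	defer resp.Body.Close()
	io.ReadAll(resp.Body)
}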
Output
When you run the server and client, the output will show the trace propagation:
Server Trace
{
"Name": "handle-request",
"SpanContext": {
"TraceID": "817f4043c5837f2bbb44562f3683f274",
"SpanID": "3bba3b994e029bfc"
},
"Parent": {
"TraceID": "817f4043c5837f2bbb44562f3683f274",
"SpanID": "892d624c6f0c01a6"
}
}
Client Trace
{
"Name": "HTTP GET",
"SpanContext": {
"TraceID": "817f4043c5837f2bbb44562f3683f274",
"SpanID": "892d624c6f0c01a6"
},
"Parent": {
"TraceID": "817f4043c5837f2bbb44562f3683f274",
"SpanID": "1f312e90fb65c0e3"
}
}
{
"Name": "hello-span",
"SpanContext": {
"TraceID": "817f4043c5837f2bbb44562f3683f274",
"SpanID": "1f312e90fb65c0e3"
},
"Parent": {
"TraceID": "00000000000000000000000000000000",
"SpanID": "0000000000000000"
}
}
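Reading the IDs above by eye works for three spans, but the same linking can be done mechanically. The sketch below is a standard-library-only illustration: `exportedSpan`, `exampleSpans`, and `pathFromRoot` are hypothetical names, and the data is copied from the output above. It follows ParentSpanId links from the server span back to the root.

```go
package main

import (
	"fmt"
	"strings"
)

// exportedSpan mirrors the fields shown in the output above.
type exportedSpan struct {
	Name, TraceID, SpanID, ParentSpanID string
}

// noParent is the all-zero span ID that marks a root span.
const noParent = "0000000000000000"

// exampleSpans holds the three spans from the server and client output.
var exampleSpans = []exportedSpan{
	{"handle-request", "817f4043c5837f2bbb44562f3683f274", "3bba3b994e029bfc", "892d624c6f0c01a6"},
	{"HTTP GET", "817f4043c5837f2bbb44562f3683f274", "892d624c6f0c01a6", "1f312e90fb65c0e3"},
	{"hello-span", "817f4043c5837f2bbb44562f3683f274", "1f312e90fb65c0e3", noParent},
}

// pathFromRoot returns the span names from the root down to the span named leaf.
func pathFromRoot(spans []exportedSpan, leaf string) []string {
	byID := make(map[string]exportedSpan, len(spans))
	var cur exportedSpan
	found := false
	for _, s := range spans {
		byID[s.SpanID] = s
		if s.Name == leaf {
			cur, found = s, true
		}
	}
	if !found {
		return nil
	}
	var path []string
	// A well-formed trace needs at most len(spans) hops; the bound also guards against cycles.
	for i := 0; i < len(spans); i++ {
		path = append([]string{cur.Name}, path...)
		parent, ok := byID[cur.ParentSpanID]
		if cur.ParentSpanID == noParent || !ok || parent.TraceID != cur.TraceID {
			break
		}
		cur = parent
	}
	return path
}

func main() {
	// prints "hello-span -> HTTP GET -> handle-request"
	fmt.Println(strings.Join(pathFromRoot(exampleSpans, "handle-request"), " -> "))
}
```

The server's span ends up at the bottom of a chain that started in the client process, which is exactly what propagation is meant to achieve.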
Conclusion
Distributed tracing with OpenTelemetry enables us to track and monitor requests across multiple services by passing trace context through headers. By understanding and implementing the elements of TraceId, SpanId, and ParentSpanId, we can visualize the flow of a request and diagnose issues more effectively.
With the standardized W3C Trace Context, trace context propagation becomes consistent and interoperable across different services and platforms.
This article has covered the basics of how OpenTelemetry achieves distributed tracing, providing code examples and visualizations to illustrate the concepts. Happy tracing!
For more details, visit the OpenTelemetry Documentation.
For more tips and insights on monitoring and tech, follow me on Twitter @Siddhant_K_code and stay updated with the latest & detailed tech content like this. Happy coding!