Distributed Tracing in ConnectSoft Microservice Template¶
Purpose & Overview¶
Distributed Tracing provides end-to-end visibility into requests as they flow through multiple services, components, and dependencies. In the ConnectSoft Microservice Template, distributed tracing is implemented using OpenTelemetry, enabling teams to trace requests across service boundaries, understand system behavior, diagnose performance issues, and correlate logs, metrics, and traces.
Why Distributed Tracing?¶
Distributed tracing offers several critical benefits:
- End-to-End Visibility: See complete request flow across services and components
- Performance Diagnostics: Identify bottlenecks and slow operations in distributed systems
- Root Cause Analysis: Understand failure points and error propagation
- Dependency Mapping: Visualize service dependencies and interactions
- Log Correlation: Link logs to specific traces and spans
- Service Boundaries: Understand how requests traverse microservice boundaries
- Debugging: Trace specific requests through complex distributed systems
- Performance Optimization: Identify optimization opportunities across services
Distributed Tracing Philosophy
Distributed tracing treats every operation as part of a trace—a tree of spans that represents the complete journey of a request through the system. Each span captures timing, context, and metadata, enabling comprehensive observability and debugging of distributed systems.
Architecture Overview¶
OpenTelemetry Integration¶
The template uses OpenTelemetry as the distributed tracing framework:
Request Flow
    ↓
OpenTelemetry SDK
├── Automatic Instrumentation
│   ├── ASP.NET Core (HTTP requests)
│   ├── HttpClient (outgoing HTTP)
│   ├── gRPC (client & server)
│   ├── SQL Client (database operations)
│   ├── Redis (cache operations)
│   ├── MassTransit (messaging)
│   ├── NServiceBus (messaging)
│   ├── Orleans (actor invocations)
│   └── SignalR (real-time connections)
├── Custom Activities
│   └── ActivitySource-based spans
└── Trace Context Propagation
    ├── W3C Trace Context (HTTP headers)
    └── Activity.Current propagation
    ↓
OTLP Exporters
├── OpenTelemetry Collector
└── Seq (OTLP endpoint)
    ↓
Observability Backends
├── Jaeger
├── Zipkin
├── Application Insights
└── Other OTLP-compatible systems
Trace Structure¶
Trace Hierarchy:
Trace (Root)
├── Span: HTTP Request (API Gateway)
│   ├── Span: Process Request (Application)
│   │   ├── Span: Database Query (NHibernate)
│   │   ├── Span: Message Publish (MassTransit)
│   │   └── Span: External API Call (HttpClient)
│   └── Span: gRPC Call (Service B)
│       └── Span: Process Request (Service B)
└── Span: Background Job (Hangfire)
    └── Span: Process Message
Key Concepts¶
| Concept | Description |
|---|---|
| Trace | Complete request journey across services |
| Span | Individual operation within a trace |
| Activity | .NET representation of a span |
| ActivitySource | Factory for creating activities |
| Trace Context | Propagation metadata (trace ID, span ID) |
| Correlation ID | Business-level correlation identifier |
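To make the table concrete, the following is a minimal sketch of how a trace, its spans, and the .NET `Activity` type relate. It uses only `System.Diagnostics`; the source name `Demo.KeyConcepts` and the `KeyConceptsDemo` class are made up for illustration and are not part of the template. In a real service the OpenTelemetry SDK registers the listener, so the manual `ActivityListener` here is only a stand-in.

```csharp
using System;
using System.Diagnostics;

public static class KeyConceptsDemo
{
    // Hypothetical source name, used only in this demo.
    private static readonly ActivitySource Source = new("Demo.KeyConcepts");

    public static (string TraceId, string ParentSpanId, string ChildTraceId, string ChildParentSpanId) Run()
    {
        // StartActivity returns null unless something listens to the source,
        // so register a minimal listener that records everything.
        using var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == "Demo.KeyConcepts",
            Sample = (ref ActivityCreationOptions<ActivityContext> _) => ActivitySamplingResult.AllDataAndRecorded,
        };
        ActivitySource.AddActivityListener(listener);

        using var parent = Source.StartActivity("HandleRequest"); // one span
        using var child = Source.StartActivity("QueryDatabase");  // child span; parent is Activity.Current

        // Both spans share one trace ID; the child points at the parent's span ID.
        return (parent!.TraceId.ToHexString(), parent.SpanId.ToHexString(),
                child!.TraceId.ToHexString(), child.ParentSpanId.ToHexString());
    }
}
```

Calling `KeyConceptsDemo.Run()` should return the same trace ID for both spans, with the child's parent span ID equal to the parent's span ID.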
Service Registration¶
OpenTelemetry Configuration¶
Registration:
// MicroserviceRegistrationExtensions.cs
#if OpenTelemetry
services.AddMicroserviceOpenTelemetry(webHostEnvironment);
#endif
Implementation:
// OpenTelemetryExtensions.cs
internal static IServiceCollection AddMicroserviceOpenTelemetry(
this IServiceCollection services,
IWebHostEnvironment webHostEnvironment)
{
ArgumentNullException.ThrowIfNull(services);
ArgumentNullException.ThrowIfNull(webHostEnvironment);
services.AddOpenTelemetry()
.ConfigureResource(resourceBuilder =>
{
resourceBuilder.ConfigureMicroserviceOpenTelemetryResource(webHostEnvironment);
})
.WithMetrics(metricsBuilder =>
{
metricsBuilder.ConfigureMicroserviceOpenTelemetryMetrics(OptionsExtensions.OpenTelemetryOptions);
})
.WithTracing(tracingBuilder =>
{
tracingBuilder.ConfigureMicroserviceOpenTelemetryTracing(webHostEnvironment, OptionsExtensions.OpenTelemetryOptions);
});
// Enable ActivitySource support for Azure SDKs
AppContext.SetSwitch("Azure.Experimental.EnableActivitySource", isEnabled: true);
return services;
}
Resource Configuration¶
Service Identity:
private static void ConfigureMicroserviceOpenTelemetryResource(
this ResourceBuilder resourceBuilder,
IWebHostEnvironment webHostEnvironment)
{
// Add service attributes
resourceBuilder.AddService(
serviceName: webHostEnvironment.ApplicationName,
serviceVersion: "1.0.0",
serviceNamespace: "ConnectSoft.MicroserviceTemplate",
serviceInstanceId: Environment.MachineName);
// Add SDK version
resourceBuilder.AddTelemetrySdk();
// Add custom attributes
resourceBuilder.AddAttributes(
new KeyValuePair<string, object>[]
{
new (OpenTelemetryAttributeName.Deployment.Environment, webHostEnvironment.EnvironmentName),
new (OpenTelemetryAttributeName.Host.Name, Environment.MachineName),
new (OpenTelemetryAttributeName.OperatingSystem.Description, Environment.OSVersion.VersionString),
});
resourceBuilder.AddEnvironmentVariableDetector();
}
Automatic Instrumentation¶
ASP.NET Core Instrumentation¶
HTTP Request Tracing:
// OpenTelemetryExtensions.cs
tracingBuilder
.AddAspNetCoreInstrumentation(options =>
{
options.EnrichWithHttpRequest = EnrichWithHttpRequest;
options.EnrichWithHttpResponse = EnrichWithHttpResponse;
options.RecordException = true;
});
Automatic Capture:
- HTTP method and path
- Request/response headers
- Status codes
- Request/response sizes
- Duration
- Client IP address
- User identity (if authenticated)
Request Enrichment:
private static void EnrichWithHttpRequest(Activity activity, HttpRequest request)
{
    var context = request.HttpContext;
    // Tag values should be strings or primitives, so convert the IPAddress explicitly
    activity.AddTag(OpenTelemetryAttributeName.Http.ClientIP, context.Connection.RemoteIpAddress?.ToString());
    activity.AddTag(OpenTelemetryAttributeName.Http.RequestContentLength, request.ContentLength);
    activity.AddTag(OpenTelemetryAttributeName.Http.RequestContentType, request.ContentType);
    var user = context.User;
    if (user.Identity?.Name is not null)
    {
        activity.AddTag(OpenTelemetryAttributeName.EndUser.Id, user.Identity.Name);
        // Caution: this records every claim value, which may include personal data.
        // Consider restricting it to scope claims before enabling it in production.
        activity.AddTag(OpenTelemetryAttributeName.EndUser.Scope, string.Join(',', user.Claims.Select(x => x.Value)));
    }
}
HttpClient Instrumentation¶
Outgoing HTTP Calls:
tracingBuilder
.AddHttpClientInstrumentation(options =>
{
options.FilterHttpRequestMessage = message =>
message is not null &&
message.RequestUri is not null &&
!message.RequestUri.Host.Contains("visualstudio", StringComparison.Ordinal) &&
!message.RequestUri.Host.Contains("applicationinsights", StringComparison.Ordinal);
});
Automatic Capture:
- HTTP method and URL
- Request/response headers
- Status codes
- Duration
- Target host
gRPC Instrumentation¶
gRPC Client and Server:
Automatic Capture:
- gRPC method name
- Request/response metadata
- Status codes
- Duration
- Peer information
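The registration code for this section is not shown above. As a hedged sketch, gRPC client spans are commonly enabled with `AddGrpcClientInstrumentation()` from the `OpenTelemetry.Instrumentation.GrpcNetClient` package, while gRPC services hosted in ASP.NET Core are typically covered by the ASP.NET Core instrumentation. The `GrpcTracingExtensions` wrapper is hypothetical, and the template may register this differently.

```csharp
using OpenTelemetry.Trace;

internal static class GrpcTracingExtensions
{
    // Hypothetical helper; assumes the OpenTelemetry.Instrumentation.GrpcNetClient package is referenced.
    public static TracerProviderBuilder AddGrpcTracing(this TracerProviderBuilder tracingBuilder)
    {
        // Creates client spans for outgoing Grpc.Net.Client calls.
        // Incoming gRPC calls hosted on ASP.NET Core are traced by AddAspNetCoreInstrumentation().
        return tracingBuilder.AddGrpcClientInstrumentation();
    }
}
```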
SQL Client Instrumentation¶
Database Operations:
#if UseNHibernate
tracingBuilder
.AddSqlClientInstrumentation(options =>
{
// Capture CommandType.Text (default = false)
options.SetDbStatementForText = true;
// Record SQLExceptions as activity events
options.RecordException = true;
// Enrich with command timeout
options.Enrich = (activity, eventName, rawObject) =>
{
if (eventName.Equals("OnCustom", StringComparison.Ordinal) && rawObject is SqlCommand msCmd)
{
activity.SetTag("db.commandTimeout", msCmd.CommandTimeout);
}
};
});
#endif
Automatic Capture:
- SQL command text (sanitized)
- Command type
- Connection string (sanitized)
- Duration
- Exceptions
Redis Instrumentation¶
Cache Operations:
Automatic Capture:
- Redis commands
- Keys (sanitized)
- Duration
- Connection information
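No registration snippet is shown for Redis. One plausible shape, assuming the `OpenTelemetry.Instrumentation.StackExchangeRedis` package and an `IConnectionMultiplexer` registered in dependency injection, is sketched below. The `RedisTracingExtensions` wrapper is hypothetical and may not match the template's actual code.

```csharp
using OpenTelemetry.Trace;

internal static class RedisTracingExtensions
{
    // Hypothetical helper; assumes OpenTelemetry.Instrumentation.StackExchangeRedis is referenced
    // and that an IConnectionMultiplexer is available from the service provider.
    public static TracerProviderBuilder AddRedisTracing(this TracerProviderBuilder tracingBuilder)
    {
        // Emits a span for each StackExchange.Redis command issued through the connection.
        return tracingBuilder.AddRedisInstrumentation();
    }
}
```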
Messaging Instrumentation¶
MassTransit:
#if UseMassTransit
tracingBuilder
.AddSource(MassTransit.Logging.DiagnosticHeaders.DefaultListenerName);
#endif
NServiceBus:
Automatic Capture:
- Message type
- Message ID
- Correlation ID
- Duration
- Processing status
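The NServiceBus snippet is missing above. As a sketch under stated assumptions: NServiceBus 8 and later emit activities from the `NServiceBus.Core` source once `EnableOpenTelemetry()` is called on the endpoint configuration, and the tracer then subscribes to that source. The `NServiceBusTracingExtensions` class is hypothetical, and the template's own wiring may differ.

```csharp
using NServiceBus;
using OpenTelemetry.Trace;

internal static class NServiceBusTracingExtensions
{
    // Endpoint side: ask NServiceBus to emit activities (assumes NServiceBus 8+).
    public static void EnableTracing(this EndpointConfiguration endpointConfiguration)
    {
        endpointConfiguration.EnableOpenTelemetry();
    }

    // Tracer side: subscribe to the ActivitySource that NServiceBus writes to.
    public static TracerProviderBuilder AddNServiceBusTracing(this TracerProviderBuilder tracingBuilder)
    {
        return tracingBuilder.AddSource("NServiceBus.Core");
    }
}
```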
Orleans Instrumentation¶
Actor Invocations:
#if UseOrleans
tracingBuilder
.AddSource("Microsoft.Orleans.Runtime")
.AddSource("Microsoft.Orleans.Application");
#endif
Activity Propagation:
Automatic Capture:
- Grain method invocations
- Grain ID
- Activation information
- Duration
- Exceptions
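The activity propagation snippet is not shown above. In Orleans 7 and later, grain calls carry `Activity.Current` across silo boundaries only when activity propagation is enabled on the silo (and on external clients). A minimal sketch, assuming the `Microsoft.Orleans.Server` package and a hypothetical `OrleansTracingExtensions` wrapper:

```csharp
using Orleans.Hosting;

internal static class OrleansTracingExtensions
{
    // Hypothetical helper; assumes Orleans 7+.
    public static ISiloBuilder EnableTracePropagation(this ISiloBuilder siloBuilder)
    {
        // Adds grain call filters that flow trace context with each grain invocation,
        // so spans from the "Microsoft.Orleans.*" sources join the caller's trace.
        return siloBuilder.AddActivityPropagation();
    }
}
```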
MongoDB Instrumentation¶
Document Operations:
#if UseMongoDb
tracingBuilder
.AddSource("MongoDB.Driver.Core.Extensions.DiagnosticSources");
#endif
Hangfire Instrumentation¶
Background Jobs:
Automatic Capture:
- Job name
- Job ID
- Duration
- Execution status
- Exceptions
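The Hangfire registration is not shown above. One common approach, assuming the `OpenTelemetry.Instrumentation.Hangfire` package, is sketched here. The `HangfireTracingExtensions` wrapper is hypothetical and the template may guard this with its own build symbol.

```csharp
using OpenTelemetry.Trace;

internal static class HangfireTracingExtensions
{
    // Hypothetical helper; assumes OpenTelemetry.Instrumentation.Hangfire is referenced.
    public static TracerProviderBuilder AddHangfireTracing(this TracerProviderBuilder tracingBuilder)
    {
        // Creates a span around each background job execution.
        return tracingBuilder.AddHangfireInstrumentation();
    }
}
```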
SignalR Instrumentation¶
Real-Time Connections:
Semantic Kernel Instrumentation¶
AI Operations:
OpenAI Instrumentation¶
OpenAI API Calls:
#if (UseMicrosoftExtensionsAIOpenAIProvider || UseSemanticKernelOpenAIConnector)
tracingBuilder
.AddSource("OpenAI.*");
#endif
Custom Activities¶
Creating ActivitySource¶
Define ActivitySource:
using System.Diagnostics;
public static class Tracing
{
private static readonly ActivitySource ActivitySource = new("ConnectSoft.MicroserviceTemplate");
public static ActivitySource Source => ActivitySource;
}
Register ActivitySource:
// OpenTelemetryExtensions.cs
tracingBuilder
.AddSource("ConnectSoft.MicroserviceTemplate"); // Matches this exact source name; use a wildcard such as "ConnectSoft.*" to match a prefix
Creating Custom Spans¶
Basic Activity:
using var activity = Tracing.Source.StartActivity("ProcessOrder");
try
{
// Business logic
await ProcessOrderAsync(order);
activity?.SetStatus(ActivityStatusCode.Ok);
}
catch (Exception ex)
{
activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
activity?.RecordException(ex);
throw;
}
Activity with Tags:
using var activity = Tracing.Source.StartActivity("ValidateOrder");
activity?.SetTag("order.id", order.Id);
activity?.SetTag("order.customerId", order.CustomerId);
activity?.SetTag("order.total", order.TotalAmount);
activity?.SetTag("order.items", order.Items.Count);
// Business logic
await ValidateOrderAsync(order);
Activity with Events:
using var activity = Tracing.Source.StartActivity("ProcessPayment");
activity?.AddEvent(new ActivityEvent("Payment.Initiated"));
activity?.SetTag("payment.method", payment.Method);
activity?.SetTag("payment.amount", payment.Amount);
await ProcessPaymentAsync(payment);
activity?.AddEvent(new ActivityEvent("Payment.Completed"));
Activity with Baggage:
using var activity = Tracing.Source.StartActivity("ProcessRequest");
// Add baggage (propagated across services)
activity?.SetBaggage("user.id", userId);
activity?.SetBaggage("request.source", "mobile-app");
// Business logic
await ProcessRequestAsync(request);
Nested Activities¶
Child Spans:
using var parentActivity = Tracing.Source.StartActivity("CreateOrder");
try
{
    // Each child gets its own using block so it stops before the next one starts.
    // With `using var`, every child would stay open until the end of the try block,
    // and later spans would nest under earlier ones instead of being siblings.

    // Child activity 1
    using (var validationActivity = Tracing.Source.StartActivity("ValidateOrder"))
    {
        await ValidateOrderAsync(order);
        validationActivity?.SetStatus(ActivityStatusCode.Ok);
    }

    // Child activity 2
    using (var inventoryActivity = Tracing.Source.StartActivity("CheckInventory"))
    {
        await CheckInventoryAsync(order);
        inventoryActivity?.SetStatus(ActivityStatusCode.Ok);
    }

    // Child activity 3
    using (var paymentActivity = Tracing.Source.StartActivity("ProcessPayment"))
    {
        await ProcessPaymentAsync(order);
        paymentActivity?.SetStatus(ActivityStatusCode.Ok);
    }

    parentActivity?.SetStatus(ActivityStatusCode.Ok);
}
catch (Exception ex)
{
    parentActivity?.SetStatus(ActivityStatusCode.Error, ex.Message);
    parentActivity?.RecordException(ex);
    throw;
}
Activity Attributes¶
Semantic Conventions:
using var activity = Tracing.Source.StartActivity("Database.Query");
activity?.SetTag("db.system", "postgresql");
activity?.SetTag("db.name", "orders");
activity?.SetTag("db.operation", "SELECT");
activity?.SetTag("db.statement", "SELECT * FROM orders WHERE id = @id");
activity?.SetTag("db.user", "app_user");
Custom Attributes:
activity?.SetTag("business.order.id", orderId);
activity?.SetTag("business.customer.id", customerId);
activity?.SetTag("business.operation.type", "order.creation");
Trace Context Propagation¶
W3C Trace Context¶
HTTP Headers:
OpenTelemetry automatically propagates trace context via HTTP headers:
- traceparent: W3C Trace Context header
- tracestate: Additional trace state
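A `traceparent` value has four dash-separated fields: version, trace ID (32 hex characters), parent span ID (16 hex characters), and trace flags. The sketch below parses one with `ActivityContext.TryParse` from `System.Diagnostics`. The `TraceParentDemo` class is illustrative only, and the sample header is the example value from the W3C Trace Context specification.

```csharp
using System;
using System.Diagnostics;

public static class TraceParentDemo
{
    // Parses a W3C traceparent header, e.g.
    // "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"
    //  version - trace-id (32 hex) - parent-id (16 hex) - trace-flags
    public static ActivityContext Parse(string traceParent)
    {
        if (!ActivityContext.TryParse(traceParent, traceState: null, out ActivityContext context))
        {
            throw new FormatException($"Not a valid traceparent value: '{traceParent}'");
        }

        return context;
    }
}
```

For the sample header, `Parse(...).TraceId.ToHexString()` yields `0af7651916cd43dd8448eb211c80319c`, and the `Recorded` flag is set because the last field is `01`.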
Automatic Propagation:
// Outgoing HTTP request automatically includes trace context
using var httpClient = new HttpClient();
var response = await httpClient.GetAsync("https://api.example.com/users");
// Trace context is automatically included in headers
Manual Propagation:
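using var activity = Tracing.Source.StartActivity("CallExternalService");
var request = new HttpRequestMessage(HttpMethod.Get, "https://api.example.com/users");
// HttpClient normally adds traceparent itself, so manual headers are only needed
// when automatic propagation is unavailable or has been disabled.
// Activity.Id is a valid traceparent value only when the W3C ID format is in use.
var current = Activity.Current;
if (current is { IdFormat: ActivityIdFormat.W3C } && !request.Headers.Contains("traceparent"))
{
    request.Headers.Add("traceparent", current.Id);
    if (!string.IsNullOrEmpty(current.TraceStateString))
    {
        request.Headers.Add("tracestate", current.TraceStateString);
    }
}
var response = await httpClient.SendAsync(request);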
Activity.Current Propagation¶
Automatic Context Propagation:
// Activity.Current is automatically propagated across async boundaries
using var activity = Tracing.Source.StartActivity("ProcessOrder");
await Task.Run(async () =>
{
// Activity.Current is available here
using var childActivity = Tracing.Source.StartActivity("CalculateTotal");
// Child activity is automatically linked to parent
});
Message-Based Propagation¶
MassTransit:
// Use the event's ObjectId as the MassTransit CorrelationId for this message type;
// MassTransit then carries it on every message of that type
GlobalTopology.Send.UseCorrelationId<MicroserviceAggregateRootCreatedEvent>(x => x.ObjectId);
NServiceBus:
Trace context is automatically propagated via message headers.
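For transports without built-in instrumentation, trace context can be written to and read from message headers by hand. The sketch below uses `TraceContextPropagator` from the `OpenTelemetry.Api` package with a plain dictionary standing in for the message headers. The `MessageTracePropagation` class and the dictionary carrier are assumptions for illustration; in a configured service, `Propagators.DefaultTextMapPropagator` would usually be used instead so baggage is carried as well.

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using OpenTelemetry;
using OpenTelemetry.Context.Propagation;

public static class MessageTracePropagation
{
    // Explicit W3C propagator so the sketch does not depend on SDK initialization.
    private static readonly TextMapPropagator Propagator = new TraceContextPropagator();

    // Producer side: copy the span context into the outgoing headers.
    public static void Inject(ActivityContext context, IDictionary<string, string> headers)
    {
        Propagator.Inject(
            new PropagationContext(context, Baggage.Current),
            headers,
            static (carrier, key, value) => carrier[key] = value);
    }

    // Consumer side: rebuild the remote parent context from the incoming headers.
    public static ActivityContext Extract(IDictionary<string, string> headers)
    {
        PropagationContext parent = Propagator.Extract(
            default,
            headers,
            static (carrier, key) =>
                carrier.TryGetValue(key, out var value) ? new[] { value } : Enumerable.Empty<string>());

        return parent.ActivityContext;
    }
}
```

On the consumer, the extracted context can be passed as the parent when starting the processing span, for example `Tracing.Source.StartActivity("ProcessMessage", ActivityKind.Consumer, parentContext)`.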
Correlation IDs¶
Conversation ID Middleware¶
Request Correlation:
Purpose:
- Generates correlation ID for each request
- Propagates correlation ID across services
- Links logs to specific requests
- Enables request tracking
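The middleware source is not shown above, so the following is only a sketch of what such a component could look like, not the template's implementation. The `CorrelationIdMiddleware` class name, the `X-Correlation-ID` header, and the `correlation.id` key are all assumptions chosen for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

public sealed class CorrelationIdMiddleware
{
    private const string HeaderName = "X-Correlation-ID"; // assumed header name
    private const string BaggageKey = "correlation.id";   // assumed tag/baggage key

    private readonly RequestDelegate next;

    public CorrelationIdMiddleware(RequestDelegate next) => this.next = next;

    public async Task InvokeAsync(HttpContext context)
    {
        // Reuse the caller's ID when one was sent; otherwise generate a new one.
        string correlationId =
            context.Request.Headers.TryGetValue(HeaderName, out var values) && !string.IsNullOrWhiteSpace(values)
                ? values.ToString()
                : Guid.NewGuid().ToString("N");

        // Attach the ID to the current span and to baggage so child spans can read it.
        Activity.Current?.SetTag(BaggageKey, correlationId);
        Activity.Current?.SetBaggage(BaggageKey, correlationId);

        // Echo the ID back so callers can quote it in support requests.
        context.Response.Headers[HeaderName] = correlationId;

        await next(context);
    }
}
```

A middleware like this would be registered with `app.UseMiddleware<CorrelationIdMiddleware>()` early in the pipeline.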
Using Correlation IDs¶
In Logs:
// Correlation ID is automatically included in logs
_logger.LogInformation("Processing order {OrderId}", orderId);
// Log includes: traceId, spanId, correlationId
In Custom Activities:
using var activity = Tracing.Source.StartActivity("ProcessOrder");
// Get correlation ID from current context
var correlationId = Activity.Current?.GetBaggageItem("correlation.id");
activity?.SetTag("correlation.id", correlationId);
Sampling¶
Sampling Configuration¶
Development (Always On):
if (webHostEnvironment.IsDevelopment())
{
// View all traces in development
tracingBuilder.SetSampler(new AlwaysOnSampler());
}
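Production (SDK Default):
// Unless a sampler is configured, the OpenTelemetry .NET SDK default applies:
// ParentBasedSampler(AlwaysOnSampler). It records every new trace and otherwise
// follows the caller's sampling decision, which keeps traces consistent across services.
// For high-traffic services, set a ratio-based sampler explicitly (see Sampling Strategies).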
Custom Sampler:
Sampling Strategies¶
| Strategy | Use Case | Description |
|---|---|---|
| AlwaysOnSampler | Development | Sample all traces |
| AlwaysOffSampler | Testing | Sample no traces |
| TraceIdRatioBasedSampler | Production | Sample percentage based on trace ID |
| ParentBasedSampler | Multi-service | Respect parent sampling decision |
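The strategies in the table are often combined: a ratio-based sampler decides for new traces, wrapped in a parent-based sampler so downstream services honor the upstream decision. The sketch below shows one way to wire that up. The `SamplingConfiguration` class, the `ConfigureSampling` name, and the 10% default are illustrative assumptions, not values taken from the template.

```csharp
using Microsoft.Extensions.Hosting;
using OpenTelemetry.Trace;

internal static class SamplingConfiguration
{
    // Hypothetical helper: sample everything in Development, a fixed ratio elsewhere.
    public static TracerProviderBuilder ConfigureSampling(
        this TracerProviderBuilder tracingBuilder,
        IHostEnvironment environment,
        double productionRatio = 0.10)
    {
        Sampler sampler = environment.IsDevelopment()
            ? new AlwaysOnSampler()
            // Root spans are sampled by trace ID ratio; child spans follow their parent.
            : new ParentBasedSampler(new TraceIdRatioBasedSampler(productionRatio));

        return tracingBuilder.SetSampler(sampler);
    }
}
```

Because `TraceIdRatioBasedSampler` decides from the trace ID itself, services that use the same ratio reach the same decision for a given trace.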
Exporters¶
OTLP Exporter¶
OpenTelemetry Collector:
tracingBuilder.AddOtlpExporter(options =>
{
options.Protocol = (OtlpExportProtocol)otelOptions.OtlpExporter.OtlpExportProtocol;
options.Endpoint = new Uri(otelOptions.OtlpExporter.Endpoint);
});
Configuration:
{
"OpenTelemetry": {
"OtlpExporter": {
"OtlpExportProtocol": "Grpc",
"Endpoint": "http://localhost:4317"
}
}
}
Seq OTLP Exporter¶
Seq Integration:
tracingBuilder.AddOtlpExporter(options =>
{
options.Protocol = (OtlpExportProtocol)otelOptions.OtlpSeqExporter.OtlpExportProtocol;
options.Endpoint = new Uri($"{otelOptions.OtlpSeqExporter.Endpoint}/ingest/otlp/v1/traces");
});
Configuration:
{
"OpenTelemetry": {
"OtlpSeqExporter": {
"OtlpExportProtocol": "HttpProtobuf",
"Endpoint": "http://localhost:5341"
}
}
}
Console Exporter¶
Development Debugging:
if (otelOptions.EnableConsoleExporter)
{
tracingBuilder.AddConsoleExporter(options =>
{
options.Targets = ConsoleExporterOutputTargets.Console | ConsoleExporterOutputTargets.Debug;
});
}
Logging Integration¶
Trace-Linked Logging¶
Automatic Enrichment:
// Program.cs
#if OpenTelemetry
logging.AddOpenTelemetry(options =>
{
options.IncludeScopes = true;
options.ParseStateValues = true;
});
#endif
Enriched Logs:
{
  "timestamp": "2025-01-15T17:22:58Z",
  "level": "Information",
  "message": "Processing order {OrderId}",
  "orderId": "ORD-7788",
  "traceId": "ab12cd34ef567890abcdef1234567890",
  "spanId": "1a2b3c4d5e6f7890",
  "parentId": "0f9e8d7c6b5a4321",
  "traceFlags": "01",
  "service": "OrderService",
  "environment": "Production"
}
Benefits¶
- End-to-End Visibility: Every log linked to a trace span
- Root Cause Isolation: See logs before/after failures in trace context
- Cross-Service Diagnostics: Correlate logs across service boundaries
- Telemetry-to-Log Drilldown: Navigate from traces to specific log records
Configuration¶
OpenTelemetry Options¶
Configuration Class:
public sealed class OpenTelemetryOptions
{
public const string OpenTelemetryOptionsSectionName = "OpenTelemetry";
[Required]
required public bool EnableConsoleExporter { get; set; }
[Required]
[ValidateObjectMembers]
required public OtlpExporterOptions OtlpExporter { get; set; }
[Required]
[ValidateObjectMembers]
required public OtlpSeqExporterOptions OtlpSeqExporter { get; set; }
}
appsettings.json:
{
"OpenTelemetry": {
"EnableConsoleExporter": false,
"OtlpExporter": {
"OtlpExportProtocol": "Grpc",
"Endpoint": "http://localhost:4317"
},
"OtlpSeqExporter": {
"OtlpExportProtocol": "HttpProtobuf",
"Endpoint": "http://localhost:5341"
}
}
}
Best Practices¶
Do's¶
- Use Semantic Conventions
- Create Meaningful Activity Names
// ✅ GOOD - Descriptive activity names
using var activity = Tracing.Source.StartActivity("Order.Validate");
using var activity = Tracing.Source.StartActivity("Payment.Process");
// ❌ BAD - Generic names
using var activity = Tracing.Source.StartActivity("Process");
using var activity = Tracing.Source.StartActivity("DoWork");
- Set Activity Status
- Add Relevant Tags
- Use ActivitySource for Custom Activities
- Keep Activities Focused
// ✅ GOOD - One activity per logical operation
using var activity = Tracing.Source.StartActivity("ValidateOrder");
await ValidateOrderAsync(order);
// ❌ BAD - One activity for everything
using var activity = Tracing.Source.StartActivity("ProcessOrder");
await ValidateOrderAsync(order);
await CheckInventoryAsync(order);
await ProcessPaymentAsync(order);
Don'ts¶
- Don't Create Too Many Activities
// ❌ BAD - Too granular
using var activity1 = Tracing.Source.StartActivity("GetOrder");
using var activity2 = Tracing.Source.StartActivity("ParseOrder");
using var activity3 = Tracing.Source.StartActivity("ValidateOrder");
// ✅ GOOD - Reasonable granularity
using var activity = Tracing.Source.StartActivity("GetAndValidateOrder");
- Don't Include Sensitive Data in Tags
- Don't Forget to Dispose Activities
- Don't Create Activities in High-Frequency Code
// ❌ BAD - Activity in tight loop
for (int i = 0; i < 1000000; i++)
{
    using var activity = Tracing.Source.StartActivity("ProcessItem");
    ProcessItem(items[i]);
}
// ✅ GOOD - Activity outside loop
using var activity = Tracing.Source.StartActivity("ProcessItems");
for (int i = 0; i < 1000000; i++)
{
    ProcessItem(items[i]);
}
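For the "Don't Include Sensitive Data in Tags" item, one option is to tag a stable pseudonym instead of the raw value, so spans for the same customer can still be grouped. The sketch below is illustrative only: the `SafeTags` class, the `customer.hash` tag name, and the 8-byte truncation are assumptions. An unsalted hash of a guessable value can be reversed by brute force, so a keyed hash may be more appropriate for real personal data.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;
using System.Text;

public static class SafeTags
{
    // Returns the first 8 bytes of the SHA-256 digest as 16 lowercase hex characters.
    public static string Pseudonymize(string value)
    {
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(value));
        return Convert.ToHexString(hash, 0, 8).ToLowerInvariant();
    }

    public static void TagCustomer(Activity? activity, string customerEmail)
    {
        // Avoid: activity?.SetTag("customer.email", customerEmail);
        activity?.SetTag("customer.hash", Pseudonymize(customerEmail));
    }
}
```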
Troubleshooting¶
Issue: No Traces Appearing¶
Symptoms: Traces not visible in observability backend.
Solutions:
1. Verify OpenTelemetry is enabled: #if OpenTelemetry
2. Check exporter endpoint is accessible
3. Verify OTLP protocol matches collector configuration
4. Check sampling configuration (may be sampling out)
5. Verify ActivitySource is registered: .AddSource("YourSourceName")
6. Check console exporter for local debugging
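Step 5 is a frequent cause: `StartActivity` returns `null` when nothing listens to the source, which is also why the examples in this document use `activity?.`. The self-contained sketch below demonstrates that behavior with a plain `ActivityListener` standing in for the OpenTelemetry SDK. The `ListenerCheck` class and the `Demo.ListenerCheck` source name are made up for the demo.

```csharp
using System.Diagnostics;

public static class ListenerCheck
{
    // Hypothetical source; nothing subscribes to it until the listener below is added.
    private static readonly ActivitySource Source = new("Demo.ListenerCheck");

    public static (bool BeforeListener, bool AfterListener) Run()
    {
        bool before;
        using (var probe = Source.StartActivity("Probe"))
        {
            // No listener is registered yet, so no Activity object is created.
            before = probe is not null;
        }

        // Equivalent in effect to tracingBuilder.AddSource("Demo.ListenerCheck").
        using var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == Source.Name,
            Sample = (ref ActivityCreationOptions<ActivityContext> _) => ActivitySamplingResult.AllData,
        };
        ActivitySource.AddActivityListener(listener);

        using var recorded = Source.StartActivity("Probe");
        return (before, recorded is not null);
    }
}
```

If a custom span is missing from a trace, checking `Tracing.Source.HasListeners()` at runtime is a quick way to confirm whether the source name was registered.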
Issue: Incomplete Traces¶
Symptoms: Traces missing some spans.
Solutions:
1. Verify instrumentation is enabled for all frameworks
2. Check ActivitySource names match exactly
3. Verify trace context propagation is working
4. Check for exceptions in span creation
5. Verify sampling isn't filtering out spans
Issue: Trace Context Not Propagating¶
Symptoms: Spans not linked across services.
Solutions:
1. Verify W3C Trace Context headers are passed
2. Check HttpClient instrumentation is enabled
3. Verify message headers include trace context
4. Check for proxy/load balancer stripping headers
5. Verify Activity.Current is available in async code
Issue: High Overhead¶
Symptoms: Tracing causing performance degradation.
Solutions:
1. Enable sampling (reduce trace volume)
2. Use appropriate sampling rate for production
3. Filter out low-value traces
4. Optimize activity creation (avoid in tight loops)
5. Keep exporters on batch export (the OTLP exporter's default) rather than exporting each span synchronously
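Another inexpensive saving is to skip costly tag computation when a span is not being recorded. An `Activity` can exist purely to propagate context, in which case `IsAllDataRequested` is `false` and its tags are never exported. The helper below is a sketch of that guard; the `CheapTagging` class and `TagIfRecorded` method are hypothetical names.

```csharp
using System;
using System.Diagnostics;

public static class CheapTagging
{
    // Evaluates the value factory only when the span will actually be recorded.
    public static bool TagIfRecorded(Activity? activity, string key, Func<string> expensiveValue)
    {
        if (activity is not { IsAllDataRequested: true })
        {
            return false; // no span, or propagation-only span: skip the work
        }

        activity.SetTag(key, expensiveValue());
        return true;
    }
}
```

A call such as `CheapTagging.TagIfRecorded(activity, "order.summary", () => BuildSummary(order))` then pays for `BuildSummary` (a hypothetical expensive method) only on sampled requests.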
Advanced Patterns¶
Custom Sampler¶
Business-Aware Sampling:
public class BusinessValueSampler : Sampler
{
    // Delegate the 10% case to the built-in sampler so every service
    // reaches the same decision for a given trace ID
    private readonly Sampler fallback = new TraceIdRatioBasedSampler(0.10);

    public override SamplingResult ShouldSample(in SamplingParameters samplingParameters)
    {
        // Tags is a sequence (possibly null), not a dictionary, and contains only
        // the tags passed to StartActivity. Tags set later, such as an error
        // recorded when the operation fails, are not visible to a head sampler.
        if (samplingParameters.Tags is not null)
        {
            foreach (var tag in samplingParameters.Tags)
            {
                // Sample operations flagged as errors or as high business value
                if (tag.Key == "error" || (tag.Key == "business.value" && tag.Value is "high"))
                {
                    return new SamplingResult(SamplingDecision.RecordAndSample);
                }
            }
        }

        // Sample 10% of others
        return fallback.ShouldSample(samplingParameters);
    }
}
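A custom sampler only takes effect once it is registered, and it can only act on tags supplied at span creation. The sketch below shows both halves under stated assumptions: `UseRootSampler` and `StartHighValue` are hypothetical helper names, and the sampler is wrapped in `ParentBasedSampler` so child spans follow the decision made for the root.

```csharp
using System.Diagnostics;
using OpenTelemetry.Trace;

public static class BusinessSampling
{
    // Registration: the given sampler decides for root spans; children follow their parent.
    public static TracerProviderBuilder UseRootSampler(this TracerProviderBuilder builder, Sampler rootSampler)
    {
        return builder.SetSampler(new ParentBasedSampler(rootSampler));
    }

    // Span creation: pass the tag up front so the sampler can see it.
    public static Activity? StartHighValue(ActivitySource source, string name)
    {
        var tags = new ActivityTagsCollection { ["business.value"] = "high" };

        // A default parentContext means Activity.Current (if any) becomes the parent.
        return source.StartActivity(name, ActivityKind.Internal, parentContext: default, tags: tags);
    }
}
```

Usage would look like `tracingBuilder.UseRootSampler(new BusinessValueSampler())` during setup and `using var activity = BusinessSampling.StartHighValue(Tracing.Source, "Order.Checkout")` at the call site.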
Activity Enrichment¶
Runtime Enrichment:
services.AddOpenTelemetry()
.WithTracing(builder =>
{
builder.AddProcessor(new ActivityEnrichingProcessor());
});
public class ActivityEnrichingProcessor : BaseProcessor<Activity>
{
public override void OnStart(Activity activity)
{
// Add environment-specific tags
activity.SetTag("deployment.environment", Environment.GetEnvironmentVariable("ASPNETCORE_ENVIRONMENT"));
activity.SetTag("service.version", Assembly.GetExecutingAssembly().GetName().Version?.ToString());
}
}
Distributed Context Baggage¶
Cross-Service Context:
// Service A
using var activity = Tracing.Source.StartActivity("ProcessRequest");
activity?.SetBaggage("user.id", userId);
activity?.SetBaggage("request.source", "mobile-app");
// Service B (receives baggage automatically)
var userId = Activity.Current?.GetBaggageItem("user.id");
var source = Activity.Current?.GetBaggageItem("request.source");
Summary¶
Distributed Tracing in the ConnectSoft Microservice Template provides:
- ✅ OpenTelemetry Integration: Industry-standard distributed tracing
- ✅ Automatic Instrumentation: HTTP, gRPC, SQL, messaging, actors, and more
- ✅ Custom Activities: ActivitySource-based custom spans
- ✅ Trace Context Propagation: W3C Trace Context across services
- ✅ Log Correlation: Trace-linked logging for end-to-end visibility
- ✅ Multiple Exporters: OTLP, Seq, Console exporters
- ✅ Sampling: Configurable sampling for production
- ✅ Resource Attributes: Service identity and environment metadata
By following these patterns, teams can:
- Trace Requests: See complete request flow across services
- Diagnose Issues: Identify bottlenecks and failures quickly
- Understand Dependencies: Visualize service interactions
- Correlate Logs: Link logs to specific traces
- Optimize Performance: Identify slow operations
- Debug Efficiently: Trace specific requests through complex systems
Distributed tracing is essential for observability in microservice architectures, providing the visibility needed to understand, debug, and optimize distributed systems.