Trace Propagation and Public API Endpoints in .NET – Part 1 (Disable All)

The W3C Trace Context specification is an excellent standard and a big step forward in standardising telemetry correlation, particularly now that microservices are the de facto choice for new systems (that’s a debate for another day).

One of the issues with W3C Trace Context is that it doesn’t define how far a trace should propagate. This means that if a third party sends trace headers from their service, even accidentally, you’ll be using their trace IDs and including their baggage data. This can have unwanted effects in your telemetry backend, such as traces with missing root spans, or multiple API calls appearing at the top level of a single trace, which makes the trace data harder to understand and debug. Worse, the baggage from the third party could contain PII, meaning you’re processing PII without realising.

The Baggage issue

Imagine that you have a public API, which is called by your clients. This API also calls out to a third party for exchange rate information.

You’re really careful internally not to set Personally Identifiable Information (PII) in baggage, as you know that it will be sent on to the third party’s Exchange Rate service.

Your clients may not be so careful. If one of them sends baggage headers with their request, that baggage is extracted and propagated along with your own, so it also ends up at the Exchange Rate service. It isn’t your data that you’re passing on, but those incoming baggage headers aren’t useful to you either, and should therefore be ignored.
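
To make the scenario concrete, here is a minimal sketch of how third-party baggage gets picked up on the way in. It uses only the runtime’s default propagator. The header values, the customer_name key and the InboundBaggageDemo class are made up for illustration, and the dictionary stands in for the request headers that ASP.NET Core would normally supply.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class InboundBaggageDemo
{
    // Hypothetical headers sent by a third-party caller (values invented for illustration).
    public static Dictionary<string, string> ThirdPartyHeaders() =>
        new(StringComparer.OrdinalIgnoreCase)
        {
            ["traceparent"] = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
            ["baggage"] = "customer_name=jane",
        };

    // Extracts baggage with the runtime's default propagator, with no filtering in place.
    public static List<KeyValuePair<string, string?>> ExtractWithDefault(Dictionary<string, string> headers)
    {
        var propagator = DistributedContextPropagator.CreateDefaultPropagator();
        var baggage = propagator.ExtractBaggage(headers, Getter);
        return baggage?.ToList() ?? new List<KeyValuePair<string, string?>>();
    }

    // Reads a single header value from the dictionary carrier.
    private static void Getter(object? carrier, string fieldName, out string? fieldValue, out IEnumerable<string>? fieldValues)
    {
        fieldValues = null;
        fieldValue = carrier is Dictionary<string, string> headers && headers.TryGetValue(fieldName, out var value)
            ? value
            : null;
    }
}
```

Calling InboundBaggageDemo.ExtractWithDefault(InboundBaggageDemo.ThirdPartyHeaders()) hands back the caller’s baggage entry, which is exactly the data we’d rather not carry any further.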

Trace Propagation in .NET

The W3C trace context in .NET is propagated in 2 separate ways. The first is built into the .NET runtime, using a class called DistributedContextPropagator; the second is part of OpenTelemetry, using the TextMapPropagator class.

Both of these classes need to be overridden in order to disable inbound (and outbound) propagation in an ASP.NET Core site.

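To override the DistributedContextPropagator, you will need to replace the one registered by the ASP.NET Core host builder:

using System.Diagnostics;
using Microsoft.Extensions.DependencyInjection.Extensions;

var builder = WebApplication.CreateBuilder(args);

// Remove the propagator registered by the host, then register the custom one.
// RemoveAll matches on the service type; Remove with a freshly constructed
// ServiceDescriptor compares by reference, so it would not match the existing registration.
builder.Services.RemoveAll<DistributedContextPropagator>();
builder.Services.AddSingleton<DistributedContextPropagator, CustomContextPropagator>();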

// .. other service registrations

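In order to override the OpenTelemetry propagators, you need to register them with the Sdk.SetDefaultTextMapPropagator method:

using System.Collections.Generic;
using OpenTelemetry;
using OpenTelemetry.Context.Propagation;

// Replace the SDK-wide default propagator with the custom one.
Sdk.SetDefaultTextMapPropagator(new CompositeTextMapPropagator(
    new List<TextMapPropagator>
    {
        new CustomPropagator()
    }));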

If you need to inject additional dependencies into your custom propagator, OpenTelemetry provides a method called ConfigureOpenTelemetryTracerProvider, which runs as the OpenTelemetry TracerProvider is created. Its callback receives both the built ServiceProvider and the TracerProviderBuilder.

builder.Services.AddSingleton<CustomPropagator>();

// sp is the built service provider, tp is the TracerProviderBuilder.
builder.Services.ConfigureOpenTelemetryTracerProvider((sp, tp) =>
{
    Sdk.SetDefaultTextMapPropagator(new CompositeTextMapPropagator(
        new List<TextMapPropagator>
        {
            sp.GetRequiredService<CustomPropagator>()
        }));
});

Ignore All incoming trace data

The easiest way around the propagation issue is to ignore all incoming trace headers. This is fine if your service only has public endpoints. If you need something a little more granular, Part 2 includes some more details on how you can do this with a little more conditional logic.

First, we create a class derived from DistributedContextPropagator:

using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Diagnostics;

internal class DisableAllContextPropagator : DistributedContextPropagator
{
    public override IReadOnlyCollection<string> Fields { get; } = new ReadOnlyCollection<string>(new[] { "traceparent" });

    // Skeleton only: the method bodies are filled in below.
    public override IEnumerable<KeyValuePair<string, string?>>? ExtractBaggage(object? carrier, PropagatorGetterCallback? getter)
    {
        throw new NotImplementedException();
    }

    public override void ExtractTraceIdAndState(object? carrier, PropagatorGetterCallback? getter, out string? traceId, out string? traceState)
    {
        throw new NotImplementedException();
    }

    public override void Inject(Activity? activity, object? carrier, PropagatorSetterCallback? setter)
    {
        throw new NotImplementedException();
    }
}

Here, we have 3 methods that we’re interested in. The first 2 (ExtractBaggage and ExtractTraceIdAndState) extract the inbound trace context, whereas the last one (Inject) pushes our current trace context on to our downstream services.

We still want downstream trace propagation to work, as that’s what lets our internal distributed tracing produce a correlated trace waterfall, so for Inject we’ll bring in the default propagator and delegate to it. CreateDefaultPropagator is a static method on DistributedContextPropagator that creates the propagator that would have been used if we hadn’t overridden it. Right now (.NET 7) it returns a LegacyPropagator.

internal class DisableAllContextPropagator : DistributedContextPropagator
{
    private readonly DistributedContextPropagator _legacy = CreateDefaultPropagator();

    // other code

    public override void Inject(Activity? activity, object? carrier, PropagatorSetterCallback? setter)
    {
        _legacy.Inject(activity, carrier, setter);
    }
}

For our other 2 methods, we want to return defaults, as we don’t want to take in any inbound context data.

internal class DisableAllContextPropagator : DistributedContextPropagator
{
    // other code

    public override IEnumerable<KeyValuePair<string, string?>>? ExtractBaggage(object? carrier, PropagatorGetterCallback? getter)
    {
        // Ignore any inbound baggage.
        return Enumerable.Empty<KeyValuePair<string, string?>>();
    }

    public override void ExtractTraceIdAndState(object? carrier, PropagatorGetterCallback? getter, out string? traceId, out string? traceState)
    {
        // Ignore any inbound trace context, so the request starts a new trace.
        traceId = null;
        traceState = null;
    }

    // other code
}
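
Putting the three fragments together, the whole class could look something like the sketch below. One assumption differs from the skeleton above: Fields delegates to the default propagator rather than listing only traceparent, since Inject writes whatever the default propagator writes. The PropagatorCheck class and its header values are hypothetical, there only to exercise the propagator outside of ASP.NET Core.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

internal sealed class DisableAllContextPropagator : DistributedContextPropagator
{
    // What the runtime would have used if we had not replaced it.
    private readonly DistributedContextPropagator _default = CreateDefaultPropagator();

    // Assumption: report the same fields as the default propagator.
    public override IReadOnlyCollection<string> Fields => _default.Fields;

    // Inbound: never return baggage from the caller.
    public override IEnumerable<KeyValuePair<string, string?>>? ExtractBaggage(object? carrier, PropagatorGetterCallback? getter)
        => Enumerable.Empty<KeyValuePair<string, string?>>();

    // Inbound: never return the caller's trace ID or trace state.
    public override void ExtractTraceIdAndState(object? carrier, PropagatorGetterCallback? getter, out string? traceId, out string? traceState)
    {
        traceId = null;
        traceState = null;
    }

    // Outbound: keep propagating our own context downstream.
    public override void Inject(Activity? activity, object? carrier, PropagatorSetterCallback? setter)
        => _default.Inject(activity, carrier, setter);
}

// Hypothetical harness that exercises the propagator with dictionary carriers.
public static class PropagatorCheck
{
    public static (string? TraceId, int BaggageCount, bool InjectedTraceParent) Run()
    {
        var inbound = new Dictionary<string, string>
        {
            ["traceparent"] = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
            ["baggage"] = "customer_name=jane",
        };

        var propagator = new DisableAllContextPropagator();
        propagator.ExtractTraceIdAndState(inbound, Getter, out var traceId, out _);
        var baggageCount = propagator.ExtractBaggage(inbound, Getter)?.Count() ?? 0;

        // Start a local activity and inject it into an outbound carrier.
        using var activity = new Activity("outbound-call");
        activity.SetIdFormat(ActivityIdFormat.W3C);
        activity.Start();

        var outbound = new Dictionary<string, string>();
        propagator.Inject(activity, outbound,
            (carrier, name, value) => ((Dictionary<string, string>)carrier!)[name] = value);

        return (traceId, baggageCount, outbound.ContainsKey("traceparent"));
    }

    private static void Getter(object? carrier, string fieldName, out string? fieldValue, out IEnumerable<string>? fieldValues)
    {
        fieldValues = null;
        fieldValue = carrier is Dictionary<string, string> headers && headers.TryGetValue(fieldName, out var value)
            ? value
            : null;
    }
}
```

The inbound headers are ignored, while the outbound carrier still receives a traceparent for the locally started activity, which is the behaviour we’re after.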

We then need to do the same for the OpenTelemetry propagators. In OpenTelemetry, however, there are 2 separate propagators: one for the trace context (TraceContextPropagator) and one for the baggage (BaggagePropagator). The code is pretty similar, and the logic is the same. Both derive from TextMapPropagator, which has only 2 methods we’re interested in.

using System;
using System.Collections.Generic;
using OpenTelemetry.Context.Propagation;

internal class DisableAllTracePropagator : TraceContextPropagator
{
    // Skeleton only: the method bodies are filled in below.
    public override PropagationContext Extract<T>(PropagationContext currentContext, T carrier, Func<T, string, IEnumerable<string>> getter)
    {
        throw new NotImplementedException();
    }

    public override void Inject<T>(PropagationContext context, T carrier, Action<T, string, string> setter)
    {
        throw new NotImplementedException();
    }
}

As with the DistributedContextPropagator, we want to return defaults from the Extract<T> method and delegate the Inject<T> method to what would have been the existing propagator.

    public override PropagationContext Extract<T>(PropagationContext currentContext, T carrier, Func<T, string, IEnumerable<string>> getter)
    {
        return new PropagationContext(new ActivityContext(), new Baggage());
    }

    public override void Inject<T>(PropagationContext context, T carrier, Action<T, string, string> setter)
    {
        base.Inject(context, carrier, setter);
    }

Repeat the same code for the BaggagePropagator.
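
As a rough sketch of what that repetition could look like, assuming BaggagePropagator can be subclassed in the same way as TraceContextPropagator (the DisableAllBaggagePropagator name is the one used in the registration code later in this post):

```csharp
using System;
using System.Collections.Generic;
using OpenTelemetry.Context.Propagation;

internal class DisableAllBaggagePropagator : BaggagePropagator
{
    // Inbound: return an empty context so no baggage from the caller is used.
    public override PropagationContext Extract<T>(PropagationContext context, T carrier, Func<T, string, IEnumerable<string>> getter)
    {
        return new PropagationContext(default, default);
    }

    // Outbound: let the built-in BaggagePropagator write our own baggage downstream.
    public override void Inject<T>(PropagationContext context, T carrier, Action<T, string, string> setter)
    {
        base.Inject(context, carrier, setter);
    }
}
```

Strictly speaking, the Inject override isn’t needed, as the base implementation would be used anyway; it’s kept here to mirror the trace context propagator.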

Once we have all the classes, we need to register them. I do this with an extension method on IServiceCollection, as it encapsulates the setup nicely and gives it context without having to use comments and sections.

    // Requires: using Microsoft.Extensions.DependencyInjection.Extensions; (for RemoveAll)
    public static IServiceCollection DisableInboundTracePropagation(this IServiceCollection services)
    {
        // Replace the runtime propagator registered by the host.
        services.RemoveAll<DistributedContextPropagator>();
        services.AddSingleton<DistributedContextPropagator, DisableAllContextPropagator>();

        // Replace the OpenTelemetry propagators as the TracerProvider is created.
        services.ConfigureOpenTelemetryTracerProvider((sp, tp) =>
        {
            Sdk.SetDefaultTextMapPropagator(new CompositeTextMapPropagator(
                new List<TextMapPropagator>
                {
                    new DisableAllTracePropagator(),
                    new DisableAllBaggagePropagator()
                }));
        });
        return services;
    }

Conclusion

Trace propagation is the true superpower of production debugging in distributed systems but, as the movies say, “With great power comes great responsibility”. You need to consider carefully whether you trust your consumers not to send those headers, whether you’re going to strip them before they reach your application, or whether you want to be a little more clever.

In the next post, I’ll cover a more advanced approach to deciding when to trust inbound trace context, based on criteria from the request, such as allowing it only for specific endpoints.

3 thoughts on “Trace Propagation and Public API Endpoints in .NET – Part 1 (Disable All)”

  1. Great Article! You note a Part 2 – is this still in the works?

    “If you need something a little more granular, Part 2 includes some more details on how you can do this with a little more conditional logic.”

    1. Hey, I’ve not had much feedback on whether that’s something people wanted so I’ve not prioritised it. What is it you’d like to see from Part 2, is there something particular that would be interesting?
