Skip to content

Commit 0f1ea2f

Browse files
mydeaLms24
andauthored
fix(opentelemetry): Fix span & sampling propagation (#11092)
OK, this was a tricky one, but I _think_ it now works as expected. This PR fixes to fundamental issues with sampling & propagation that were uncovered by @Lms24 & myself while trying to use OTEL for remix & sveltekit: 1. `continueTrace` updates the propagation context, but if there is an active parent span (even a remote one) this is ignored. 2. Sampling inheritance did not work as expected, due to the fact that OTEL spans cannot differentiate between `sampled=false` (sampled to be not recorded) and `sampled=undefined` (no sampling decision yet). ## Update to `continueTrace` & trace propagation While my first instinct was to ensure that in the trace methods, if we have remote span we ignore it and look at the propagation context, this has a bunch of problems - because it means we can run out of sync, if this is set from outside, etc. So instead, I now provide a custom `continueTrace` method from `@sentry/opentelemetry` & `@sentry/node` which should be used instead of the core one in meta SDKs. This method will, in addition to updating the propagation context, _also_ create a remote span with the passed in data, and make it the active span in the callback. Then, I updated the otel start span APIs to always use that, if it exists (which was already the behavior we had), PLUS also added behavior that if there is no active span at all (not even a remote one), _then_ we look at the propagation context. ## Update to sampling inheritance Previously, we basically did the following: ```ts const sampled: Boolean | undefined = spanContext.traceFlags === TraceFlags.SAMPLED; // this will always be true or false, never undefined ``` Which means that if we create a remote span from a minimal propagation context: ```ts // This could be a generated propagation context from a scope const propagationContext = { spanId: 'xxx', traceId: 'yyy' }; const spanContext: SpanContext = { sampled: propagationContext.sampled ? TraceFlags.SAMPLED : TraceFlags.NONE, } ``` We would later always get `sampled: false`, and inherit this decision for all downstream spans - instead of treating it as `undefined`, and going through the sampler, as we actually want it to. In order to "solve" this, I added a new trace state `SENTRY_TRACE_STATE_SAMPLED_NOT_RECORDING`, which we set if we _know_ this is actually `sampled: false`, and not just unset. Then, based on this we can interpret `sampled` as being `false` or `undefined`, respectively. This is a bit hacky but should work - it means that if we get a sampling decision from outside we'll treat it as `undefined`, which is OK I would say. Our own sampler will set this correctly so we inherit correctly as well, and our propagator does so too. --------- Co-authored-by: Lukas Stracke <[email protected]>
1 parent a76e4a2 commit 0f1ea2f

File tree

16 files changed

+603
-79
lines changed

16 files changed

+603
-79
lines changed

packages/astro/src/index.types.ts

+2-5
Original file line numberDiff line numberDiff line change
@@ -28,11 +28,8 @@ export declare const getActiveSpan: typeof clientSdk.getActiveSpan;
2828
// eslint-disable-next-line deprecation/deprecation
2929
export declare const getCurrentHub: typeof clientSdk.getCurrentHub;
3030
export declare const getClient: typeof clientSdk.getClient;
31-
export declare const startSpan: typeof clientSdk.startSpan;
32-
export declare const startInactiveSpan: typeof clientSdk.startInactiveSpan;
33-
export declare const startSpanManual: typeof clientSdk.startSpanManual;
34-
export declare const withActiveSpan: typeof clientSdk.withActiveSpan;
35-
export declare const getRootSpan: typeof clientSdk.getRootSpan;
31+
export declare const continueTrace: typeof clientSdk.continueTrace;
32+
3633
export declare const Span: clientSdk.Span;
3734

3835
export declare const metrics: typeof clientSdk.metrics & typeof serverSdk.metrics;

packages/core/src/utils/spanUtils.ts

+1-2
Original file line numberDiff line numberDiff line change
@@ -162,8 +162,7 @@ export function spanIsSampled(span: Span): boolean {
162162
// We align our trace flags with the ones OpenTelemetry use
163163
// So we also check for sampled the same way they do.
164164
const { traceFlags } = span.spanContext();
165-
// eslint-disable-next-line no-bitwise
166-
return Boolean(traceFlags & TRACE_FLAG_SAMPLED);
165+
return traceFlags === TRACE_FLAG_SAMPLED;
167166
}
168167

169168
/** Get the status message to use for a JSON representation of a span. */

packages/node-experimental/src/index.ts

+4-1
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,10 @@ export {
4747
extractRequestData,
4848
} from '@sentry/utils';
4949

50+
// These are custom variants that need to be used instead of the core one
51+
// As they have slightly different implementations
52+
export { continueTrace } from '@sentry/opentelemetry';
53+
5054
export {
5155
addBreadcrumb,
5256
isInitialized,
@@ -78,7 +82,6 @@ export {
7882
setCurrentClient,
7983
Scope,
8084
setMeasurement,
81-
continueTrace,
8285
getSpanDescendants,
8386
parameterize,
8487
getCurrentScope,

packages/node-experimental/test/integration/scope.test.ts

+8-2
Original file line numberDiff line numberDiff line change
@@ -65,6 +65,8 @@ describe('Integration | Scope', () => {
6565
trace: {
6666
span_id: spanId,
6767
trace_id: traceId,
68+
// local span ID from propagation context
69+
...(enableTracing ? { parent_span_id: expect.any(String) } : undefined),
6870
},
6971
}),
7072
}),
@@ -110,6 +112,8 @@ describe('Integration | Scope', () => {
110112
status: 'ok',
111113
trace_id: traceId,
112114
origin: 'manual',
115+
// local span ID from propagation context
116+
parent_span_id: expect.any(String),
113117
},
114118
}),
115119
spans: [],
@@ -194,7 +198,8 @@ describe('Integration | Scope', () => {
194198
? {
195199
span_id: spanId1,
196200
trace_id: traceId1,
197-
parent_span_id: undefined,
201+
// local span ID from propagation context
202+
...(enableTracing ? { parent_span_id: expect.any(String) } : undefined),
198203
}
199204
: expect.any(Object),
200205
}),
@@ -220,7 +225,8 @@ describe('Integration | Scope', () => {
220225
? {
221226
span_id: spanId2,
222227
trace_id: traceId2,
223-
parent_span_id: undefined,
228+
// local span ID from propagation context
229+
...(enableTracing ? { parent_span_id: expect.any(String) } : undefined),
224230
}
225231
: expect.any(Object),
226232
}),

packages/node-experimental/test/integration/transactions.test.ts

+4
Original file line numberDiff line numberDiff line change
@@ -267,6 +267,8 @@ describe('Integration | Transactions', () => {
267267
status: 'ok',
268268
trace_id: expect.any(String),
269269
origin: 'auto.test',
270+
// local span ID from propagation context
271+
parent_span_id: expect.any(String),
270272
},
271273
}),
272274
spans: [expect.any(Object), expect.any(Object)],
@@ -312,6 +314,8 @@ describe('Integration | Transactions', () => {
312314
status: 'ok',
313315
trace_id: expect.any(String),
314316
origin: 'manual',
317+
// local span ID from propagation context
318+
parent_span_id: expect.any(String),
315319
},
316320
}),
317321
spans: [expect.any(Object), expect.any(Object)],

packages/opentelemetry/src/constants.ts

+3-1
Original file line numberDiff line numberDiff line change
@@ -2,8 +2,10 @@ import { createContextKey } from '@opentelemetry/api';
22

33
export const SENTRY_TRACE_HEADER = 'sentry-trace';
44
export const SENTRY_BAGGAGE_HEADER = 'baggage';
5-
export const SENTRY_TRACE_STATE_DSC = 'sentry.trace';
5+
6+
export const SENTRY_TRACE_STATE_DSC = 'sentry.dsc';
67
export const SENTRY_TRACE_STATE_PARENT_SPAN_ID = 'sentry.parent_span_id';
8+
export const SENTRY_TRACE_STATE_SAMPLED_NOT_RECORDING = 'sentry.sampled_not_recording';
79

810
export const SENTRY_SCOPES_CONTEXT_KEY = createContextKey('sentry_scopes');
911

packages/opentelemetry/src/index.ts

+1-1
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ export {
2121
export { isSentryRequestSpan } from './utils/isSentryRequest';
2222

2323
export { getActiveSpan } from './utils/getActiveSpan';
24-
export { startSpan, startSpanManual, startInactiveSpan, withActiveSpan } from './trace';
24+
export { startSpan, startSpanManual, startInactiveSpan, withActiveSpan, continueTrace } from './trace';
2525

2626
// eslint-disable-next-line deprecation/deprecation
2727
export { setupGlobalHub } from './custom/hub';

packages/opentelemetry/src/propagator.ts

+100-25
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,8 @@
11
import type { Baggage, Context, SpanContext, TextMapGetter, TextMapSetter } from '@opentelemetry/api';
2+
import { context } from '@opentelemetry/api';
23
import { TraceFlags, propagation, trace } from '@opentelemetry/api';
34
import { TraceState, W3CBaggagePropagator, isTracingSuppressed } from '@opentelemetry/core';
5+
import type { continueTrace } from '@sentry/core';
46
import { getClient, getCurrentScope, getDynamicSamplingContextFromClient, getIsolationScope } from '@sentry/core';
57
import type { DynamicSamplingContext, PropagationContext } from '@sentry/types';
68
import {
@@ -16,18 +18,20 @@ import {
1618
SENTRY_TRACE_HEADER,
1719
SENTRY_TRACE_STATE_DSC,
1820
SENTRY_TRACE_STATE_PARENT_SPAN_ID,
21+
SENTRY_TRACE_STATE_SAMPLED_NOT_RECORDING,
1922
} from './constants';
2023
import { getScopesFromContext, setScopesOnContext } from './utils/contextData';
2124
import { setIsSetup } from './utils/setupCheck';
2225

2326
/** Get the Sentry propagation context from a span context. */
2427
export function getPropagationContextFromSpanContext(spanContext: SpanContext): PropagationContext {
25-
const { traceId, spanId, traceFlags, traceState } = spanContext;
28+
const { traceId, spanId, traceState } = spanContext;
2629

2730
const dscString = traceState ? traceState.get(SENTRY_TRACE_STATE_DSC) : undefined;
2831
const dsc = dscString ? baggageHeaderToDynamicSamplingContext(dscString) : undefined;
2932
const parentSpanId = traceState ? traceState.get(SENTRY_TRACE_STATE_PARENT_SPAN_ID) : undefined;
30-
const sampled = traceFlags === TraceFlags.SAMPLED;
33+
34+
const sampled = getSamplingDecision(spanContext);
3135

3236
return {
3337
traceId,
@@ -78,32 +82,18 @@ export class SentryPropagator extends W3CBaggagePropagator {
7882
*/
7983
public extract(context: Context, carrier: unknown, getter: TextMapGetter): Context {
8084
const maybeSentryTraceHeader: string | string[] | undefined = getter.get(carrier, SENTRY_TRACE_HEADER);
81-
const maybeBaggageHeader = getter.get(carrier, SENTRY_BAGGAGE_HEADER);
85+
const baggage = getter.get(carrier, SENTRY_BAGGAGE_HEADER);
8286

83-
const sentryTraceHeader = maybeSentryTraceHeader
87+
const sentryTrace = maybeSentryTraceHeader
8488
? Array.isArray(maybeSentryTraceHeader)
8589
? maybeSentryTraceHeader[0]
8690
: maybeSentryTraceHeader
8791
: undefined;
8892

89-
const propagationContext = propagationContextFromHeaders(sentryTraceHeader, maybeBaggageHeader);
90-
91-
// We store the DSC as OTEL trace state on the span context
92-
const traceState = makeTraceState({
93-
parentSpanId: propagationContext.parentSpanId,
94-
dsc: propagationContext.dsc,
95-
});
96-
97-
const spanContext: SpanContext = {
98-
traceId: propagationContext.traceId,
99-
spanId: propagationContext.parentSpanId || '',
100-
isRemote: true,
101-
traceFlags: propagationContext.sampled === true ? TraceFlags.SAMPLED : TraceFlags.NONE,
102-
traceState,
103-
};
93+
const propagationContext = propagationContextFromHeaders(sentryTrace, baggage);
10494

10595
// Add remote parent span context,
106-
const ctxWithSpanContext = trace.setSpanContext(context, spanContext);
96+
const ctxWithSpanContext = getContextWithRemoteActiveSpan(context, { sentryTrace, baggage });
10797

10898
// Also update the scope on the context (to be sure this is picked up everywhere)
10999
const scopes = getScopesFromContext(ctxWithSpanContext);
@@ -128,8 +118,13 @@ export class SentryPropagator extends W3CBaggagePropagator {
128118
export function makeTraceState({
129119
parentSpanId,
130120
dsc,
131-
}: { parentSpanId?: string; dsc?: Partial<DynamicSamplingContext> }): TraceState | undefined {
132-
if (!parentSpanId && !dsc) {
121+
sampled,
122+
}: {
123+
parentSpanId?: string;
124+
dsc?: Partial<DynamicSamplingContext>;
125+
sampled?: boolean;
126+
}): TraceState | undefined {
127+
if (!parentSpanId && !dsc && sampled !== false) {
133128
return undefined;
134129
}
135130

@@ -140,7 +135,11 @@ export function makeTraceState({
140135
? new TraceState().set(SENTRY_TRACE_STATE_PARENT_SPAN_ID, parentSpanId)
141136
: new TraceState();
142137

143-
return dscString ? traceStateBase.set(SENTRY_TRACE_STATE_DSC, dscString) : traceStateBase;
138+
const traceStateWithDsc = dscString ? traceStateBase.set(SENTRY_TRACE_STATE_DSC, dscString) : traceStateBase;
139+
140+
// We also specifically want to store if this is sampled to be not recording,
141+
// or unsampled (=could be either sampled or not)
142+
return sampled === false ? traceStateWithDsc.set(SENTRY_TRACE_STATE_SAMPLED_NOT_RECORDING, '1') : traceStateWithDsc;
144143
}
145144

146145
function getInjectionData(context: Context): {
@@ -161,7 +160,7 @@ function getInjectionData(context: Context): {
161160
dynamicSamplingContext,
162161
traceId: spanContext.traceId,
163162
spanId: spanContext.spanId,
164-
sampled: spanContext.traceFlags === TraceFlags.SAMPLED,
163+
sampled: getSamplingDecision(spanContext),
165164
};
166165
}
167166

@@ -188,7 +187,7 @@ function getInjectionData(context: Context): {
188187
dynamicSamplingContext,
189188
traceId: spanContext.traceId,
190189
spanId: spanContext.spanId,
191-
sampled: spanContext.traceFlags === TraceFlags.SAMPLED,
190+
sampled: getSamplingDecision(spanContext),
192191
};
193192
}
194193

@@ -221,3 +220,79 @@ function getDynamicSamplingContext(
221220

222221
return undefined;
223222
}
223+
224+
function getContextWithRemoteActiveSpan(
225+
ctx: Context,
226+
{ sentryTrace, baggage }: Parameters<typeof continueTrace>[0],
227+
): Context {
228+
const propagationContext = propagationContextFromHeaders(sentryTrace, baggage);
229+
230+
// We store the DSC as OTEL trace state on the span context
231+
const traceState = makeTraceState({
232+
parentSpanId: propagationContext.parentSpanId,
233+
dsc: propagationContext.dsc,
234+
sampled: propagationContext.sampled,
235+
});
236+
237+
const spanContext: SpanContext = {
238+
traceId: propagationContext.traceId,
239+
spanId: propagationContext.parentSpanId || '',
240+
isRemote: true,
241+
traceFlags: propagationContext.sampled ? TraceFlags.SAMPLED : TraceFlags.NONE,
242+
traceState,
243+
};
244+
245+
return trace.setSpanContext(ctx, spanContext);
246+
}
247+
248+
/**
249+
* Takes trace strings and propagates them as a remote active span.
250+
* This should be used in addition to `continueTrace` in OTEL-powered environments.
251+
*/
252+
export function continueTraceAsRemoteSpan<T>(
253+
ctx: Context,
254+
options: Parameters<typeof continueTrace>[0],
255+
callback: () => T,
256+
): T {
257+
const ctxWithSpanContext = getContextWithRemoteActiveSpan(ctx, options);
258+
259+
return context.with(ctxWithSpanContext, callback);
260+
}
261+
262+
/**
263+
* OpenTelemetry only knows about SAMPLED or NONE decision,
264+
* but for us it is important to differentiate between unset and unsampled.
265+
*
266+
* Both of these are identified as `traceFlags === TracegFlags.NONE`,
267+
* but we additionally look at a special trace state to differentiate between them.
268+
*/
269+
export function getSamplingDecision(spanContext: SpanContext): boolean | undefined {
270+
const { traceFlags, traceState } = spanContext;
271+
272+
const sampledNotRecording = traceState ? traceState.get(SENTRY_TRACE_STATE_SAMPLED_NOT_RECORDING) === '1' : false;
273+
274+
// If trace flag is `SAMPLED`, we interpret this as sampled
275+
// If it is `NONE`, it could mean either it was sampled to be not recorder, or that it was not sampled at all
276+
// For us this is an important difference, sow e look at the SENTRY_TRACE_STATE_SAMPLED_NOT_RECORDING
277+
// to identify which it is
278+
if (traceFlags === TraceFlags.SAMPLED) {
279+
return true;
280+
}
281+
282+
if (sampledNotRecording) {
283+
return false;
284+
}
285+
286+
// Fall back to DSC as a last resort, that may also contain `sampled`...
287+
const dscString = traceState ? traceState.get(SENTRY_TRACE_STATE_DSC) : undefined;
288+
const dsc = dscString ? baggageHeaderToDynamicSamplingContext(dscString) : undefined;
289+
290+
if (dsc?.sampled === 'true') {
291+
return true;
292+
}
293+
if (dsc?.sampled === 'false') {
294+
return false;
295+
}
296+
297+
return undefined;
298+
}

packages/opentelemetry/src/sampler.ts

+9-4
Original file line numberDiff line numberDiff line change
@@ -1,14 +1,15 @@
1-
/* eslint-disable no-bitwise */
21
import type { Attributes, Context, SpanContext } from '@opentelemetry/api';
3-
import { TraceFlags, isSpanContextValid, trace } from '@opentelemetry/api';
2+
import { isSpanContextValid, trace } from '@opentelemetry/api';
3+
import { TraceState } from '@opentelemetry/core';
44
import type { Sampler, SamplingResult } from '@opentelemetry/sdk-trace-base';
55
import { SamplingDecision } from '@opentelemetry/sdk-trace-base';
66
import { SEMANTIC_ATTRIBUTE_SENTRY_SAMPLE_RATE, hasTracingEnabled } from '@sentry/core';
77
import type { Client, ClientOptions, SamplingContext } from '@sentry/types';
88
import { isNaN, logger } from '@sentry/utils';
9+
import { SENTRY_TRACE_STATE_SAMPLED_NOT_RECORDING } from './constants';
910

1011
import { DEBUG_BUILD } from './debug-build';
11-
import { getPropagationContextFromSpanContext } from './propagator';
12+
import { getPropagationContextFromSpanContext, getSamplingDecision } from './propagator';
1213
import { setIsSetup } from './utils/setupCheck';
1314

1415
/**
@@ -38,6 +39,7 @@ export class SentrySampler implements Sampler {
3839
}
3940

4041
const parentContext = trace.getSpanContext(context);
42+
const traceState = parentContext?.traceState || new TraceState();
4143

4244
let parentSampled: boolean | undefined = undefined;
4345

@@ -49,7 +51,7 @@ export class SentrySampler implements Sampler {
4951
DEBUG_BUILD &&
5052
logger.log(`[Tracing] Inheriting remote parent's sampled decision for ${spanName}: ${parentSampled}`);
5153
} else {
52-
parentSampled = Boolean(parentContext.traceFlags & TraceFlags.SAMPLED);
54+
parentSampled = getSamplingDecision(parentContext);
5355
DEBUG_BUILD && logger.log(`[Tracing] Inheriting parent's sampled decision for ${spanName}: ${parentSampled}`);
5456
}
5557
}
@@ -76,6 +78,7 @@ export class SentrySampler implements Sampler {
7678
return {
7779
decision: SamplingDecision.NOT_RECORD,
7880
attributes,
81+
traceState: traceState.set(SENTRY_TRACE_STATE_SAMPLED_NOT_RECORDING, '1'),
7982
};
8083
}
8184

@@ -93,6 +96,7 @@ export class SentrySampler implements Sampler {
9396
return {
9497
decision: SamplingDecision.NOT_RECORD,
9598
attributes,
99+
traceState: traceState.set(SENTRY_TRACE_STATE_SAMPLED_NOT_RECORDING, '1'),
96100
};
97101
}
98102

@@ -112,6 +116,7 @@ export class SentrySampler implements Sampler {
112116
return {
113117
decision: SamplingDecision.NOT_RECORD,
114118
attributes,
119+
traceState: traceState.set(SENTRY_TRACE_STATE_SAMPLED_NOT_RECORDING, '1'),
115120
};
116121
}
117122

packages/opentelemetry/src/spanProcessor.ts

+1-1
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,7 @@ function onSpanStart(span: Span, parentContext: Context): void {
2020
let scopes = getScopesFromContext(parentContext);
2121

2222
// We need access to the parent span in order to be able to move up the span tree for breadcrumbs
23-
if (parentSpan) {
23+
if (parentSpan && !parentSpan.spanContext().isRemote) {
2424
addChildSpanToSpan(parentSpan, span);
2525
}
2626

0 commit comments

Comments
 (0)