
Serverless Distributed Tracing: Trace Extractor/Propagation for batched events #317

Open
@lucashfreitas


I am working with a serverless event-driven architecture that uses EventBridge, SQS, and Lambda:

  1. A Lambda function (wrapped by the datadog-cdk construct) pushes a message to EventBridge.
  2. The event bus has SQS queues as targets and forwards the messages to them.
  3. A Lambda function (wrapped by the datadog-cdk construct) consumes the messages from the SQS queue and sends them to the event bus again, which takes us back to step 1.

Our goal is to enable end-to-end traces for this architecture.

1. We wrapped all Lambdas (publishers and consumers) with the datadog-cdk construct, but this produced multiple disconnected traces:

Based on this documentation, https://docs.datadoghq.com/serverless/distributed_tracing/serverless_trace_propagation/?tab=nodejs, I would expect trace propagation to happen automatically, as mentioned there:

"Tracing many AWS Managed services (listed here) is supported out-of-the-box and does not require following the steps outlined on this page."

However, the traces are not being associated/propagated, and I am seeing multiple disconnected traces. I am not sure whether this happens because EventBridge invokes the Lambda asynchronously, in which case we may really need to "manually" extract the traceContext and pass it through the _datadog field in the event bus.
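For illustration, passing the trace context through a `_datadog` field could look roughly like the sketch below. The helper names (`withTraceContext`, `buildEntry`), the source, detail type and bus name, and the trace ID values are all hypothetical; the only assumption about Datadog is that the context travels as a plain object of `x-datadog-*` headers, such as a tracer's text-map injection would produce.

```javascript
// Hypothetical helper: copy Datadog trace headers into the EventBridge
// `detail` payload under a `_datadog` key so a consumer can read them back.
// `traceHeaders` is assumed to be a plain object of header name -> string value.
function withTraceContext(detail, traceHeaders) {
  return { ...detail, _datadog: { ...traceHeaders } };
}

// Build one PutEvents entry. Source, DetailType and EventBusName are placeholders.
function buildEntry(detail, traceHeaders) {
  return {
    Source: "my.service",
    DetailType: "order.created",
    EventBusName: "my-bus",
    Detail: JSON.stringify(withTraceContext(detail, traceHeaders)),
  };
}

// Example with made-up IDs; in a real publisher these would come from the active span.
const entry = buildEntry(
  { orderId: "123" },
  {
    "x-datadog-trace-id": "1234567890",
    "x-datadog-parent-id": "987654321",
    "x-datadog-sampling-priority": "1",
  }
);
```

The resulting entry is what the publisher would pass to EventBridge's PutEvents call; the consumer side then has to read `detail._datadog` back out of each message.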

2. We implemented manual trace extraction/propagation following the Datadog documentation:

We followed the docs/tutorial at https://docs.datadoghq.com/serverless/distributed_tracing/serverless_trace_propagation/?tab=nodejs and managed to connect the traces, but we are now facing another issue: handling/propagating traces for batched events in Lambda functions.

All the examples/docs for trace extraction, and even the handler wrapper provided by this library, expect a single trace context to be returned per Lambda invocation.

import { datadog } from "datadog-lambda-js";

const lambdaHandler = async (event, context) => {
  // my lambda handler
};

// Wrap the handler itself (not the exported name) and pass a custom extractor.
export const handler = datadog(lambdaHandler, {
  traceExtractor: (event, context) => {
    // datadog expects a single trace context to be returned here.
  },
});

If we instead export the extractor from a file in the function and set DD_TRACE_EXTRACTOR, that extractor also returns a single object.

The issue is that our Lambda function actually handles a batch of events coming from an SQS queue (10+), and each of those events might carry a different trace context. We are not sure how to handle this with this library, or whether we should instead use the dd-trace library directly to create a trace for each event in the batch and send it to Datadog.
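To make the batch problem concrete, here is a minimal sketch of reading one trace context per SQS record rather than one per invocation. It assumes each record body is the EventBridge envelope as JSON and that the publisher stored the trace headers under `detail._datadog`; the helper name `extractContextsFromBatch` and the sample IDs are hypothetical and not part of datadog-lambda-js.

```javascript
// Hypothetical helper: pull the `_datadog` carrier out of every SQS record.
// Assumes record.body is the EventBridge envelope (JSON) and that the
// publisher put the trace headers under `detail._datadog`.
function extractContextsFromBatch(event) {
  return (event.Records || []).map((record) => {
    let carrier = null;
    try {
      const envelope = JSON.parse(record.body);
      carrier = (envelope.detail && envelope.detail._datadog) || null;
    } catch (err) {
      // Malformed body: leave the carrier as null so the record is still processed.
    }
    return { messageId: record.messageId, carrier };
  });
}

// Example batch with made-up IDs: two records from different traces, one unparsable.
const batch = {
  Records: [
    {
      messageId: "a",
      body: JSON.stringify({
        detail: { _datadog: { "x-datadog-trace-id": "111", "x-datadog-parent-id": "222" } },
      }),
    },
    {
      messageId: "b",
      body: JSON.stringify({
        detail: { _datadog: { "x-datadog-trace-id": "333", "x-datadog-parent-id": "444" } },
      }),
    },
    { messageId: "c", body: "not json" },
  ],
};

const contexts = extractContextsFromBatch(batch);
```

Each non-null carrier could then be handed to dd-trace directly, for example via `tracer.extract("text_map", carrier)` and a per-record span started with `childOf` set to the extracted context. That is exactly the part that falls outside the single-context `traceExtractor` option, and it is untested here.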

Can someone help, or confirm whether this is not achievable with this library and we really need to use dd-trace to manually create and send the traces to Datadog?

Thanks

Labels: enhancement (New feature or request)
