Skip to content

Add some infrastructure for Observation #3879

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

artembilan
Copy link
Member

  • Populate an ObservationRegistry bean from the IntegrationManagementConfigurer
    into all the IntegrationManagement components
  • Introduce MessageReceiverContext and MessageSenderContext for easier usage
    in the target code
  • Implement Observation handling in the AbstractMessageHandler
  • Modify ObservationPropagationChannelInterceptorTests for new MessageSenderContext
  • Use BridgeHandler to ensure that Observation is propagated and handled properly
  • Verify that tags from the AbstractMessageHandler are preset on the consumer span

@artembilan
Copy link
Member Author

/CC @marcingrzejszczak , @jonatan-ivanov

Copy link
Contributor

@marcingrzejszczak marcingrzejszczak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I've added some small comments.

Things to consider / test

  • Use SampleTestRunner to see how the spans will look like in Zipkin / Wavefront
  • If we have multiple channels via which we are pushing a message, do we want each passing via a channel to be a separate span? For example, if we get a message from rabbit, then we go through 10 different other processing via different channels, and then we push the message out of process to kafka, then should we 12 spans, or 1 span for input, 1 for processing and 1 for output ?

public IntegrationManagementConfigurer managementConfigurer(ObjectProvider<MetricsCaptor> metricsCaptorProvider) {
public IntegrationManagementConfigurer managementConfigurer(
ObjectProvider<MetricsCaptor> metricsCaptorProvider,
ObjectProvider<ObservationRegistry> observationRegistryProvider) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that micrometer-observation is a mandatory dependency?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what we have discussed with you a couple weeks ago.
The conclusion was like: since a micrometer-observation is tiny API jar there is no reason in extra facade layer complexity on our side.
The bean by itself is optional though, so we just don't have any observation by default either way.

}
}

private void handleWithObservation(Message<?> message, ObservationRegistry observationRegistry) {
Observation.createNotStarted(CONSUME_OBSERVATION_NAME, new MessageReceiverContext(message), observationRegistry)
.lowCardinalityKeyValue("type", "handler")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We suggest moving this out to use DocumentedObservation together with ObservationConvention. That way you give users an option to control the key values, naming etc. + you can get the documentation generated automatically

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds interesting, but we definitely need to come up with that convention for end-users 😄 .

I think these couple keys I have here so far are mandatory and really specific to component.
I really design an Observation usage similar to what we have so far with Timer here or there.

The DocumentedObservation looks like - will investigate that one.

Thank you!

@artembilan
Copy link
Member Author

I'm not familiar with Zipkin / Wavefront so either view won't make too much sense for me 😄

Will look into that when I come up with some observation for message channels.

And answering to your last question: indeed there have to be 12 spans (at least). I plan to observe message channels as well - for now just it is place where we handle timers: AbstractMessageChannel and MessagingGatewaySupport (not talking about MessageHandlerSupport which is handled by this PR).

Thank you for review!
Will go to learn a DocumentedObservation...

artembilan added a commit to artembilan/spring-integration that referenced this pull request Sep 7, 2022
Fixes spring-projects#3879

* Change the `LambdaMessageProcessor` to invoke `Function`, `GenericHandler`,
`GenericSelector` & `GenericTransformer` explicitly without reflection.
This allows to avoid `--add-opens` for Java types to be used from reflections.
Plus this removes some overhead with native images.
And in general it is faster than reflection invocation.
@artembilan artembilan force-pushed the Observation_Infrastructure branch from b0e844d to bd60fac Compare September 8, 2022 23:28
@artembilan
Copy link
Member Author

Pushed some DocumentedObservation API impls.

*
* @since 6.0
*/
public final class IntegrationObservations {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you see I had to create this factory for better target usage.
I really don't see reason in the DefaultMessageReceiverObservationConvention instance per request.
And I don't feel yet that I need some customConvention.

It probably would be great if getDefaultConvention() would return an instance instead of Class and this one would always be used in case of plain observation(registry, context) call.

@marcingrzejszczak , you mentioned in Spring AMQP PR that we always have to use a convention for contextual name propagation, but as you see it is a bit involved in the target code since we have to call the observation() method with much more arguments, than just rely on a default convention by default...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't see reason in the DefaultMessageReceiverObservationConvention instance per request.

We typically create singleton fields for those in instrumentations.

And I don't feel yet that I need some customConvention.

You don't, but the users might ;) What if they want to follow their internal naming convention and you don't give them such an option?

It probably would be great if getDefaultConvention() would return an instance instead of Class and this one would always be used in case of plain observation(registry, context) call.

That will be harder to parse for the micrometer docs generator.

@marcingrzejszczak , you mentioned in spring-projects/spring-amqp#1500 (comment) that we always have to use a convention for contextual name propagation, but as you see it is a bit involved in the target code since we have to call the observation() method with much more arguments, than just rely on a default convention by default...

Yes, I mentioned that it would be preferable to use the ObservationConvention to name observations either via contextual or non-contextual methods.

The way you did it ATM doesn't allow to set any customizations by the users at the level of Spring Integration. Those can be still done by GlobalObservationConvention and the ObservationRegistry (which may be OK if you're OK with that). That means that if I want to change the tags or naming for Spring Integration then I can ONLY do it by registering a proper GlobalObservationConvention and attaching it to the ObservationRegistry. The other option would be to give the user a setter to provide a custom implementation of your MessageRecevierObservationConvention. It's up to you to decide (e.g. Reactor Netty does not allow any customizations).

@rishiraj88
Copy link

👍

COMPONENT_NAME {
@Override
public String asString() {
return "name";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the full spring.integration.handler.name wouldn't be better (same for type). Imagine that you're in zipkin and you're searching for a tag name. You can have a lot of different names but you're looking specifically for a handler name. WDYT ?

@artembilan artembilan force-pushed the Observation_Infrastructure branch from 2a64a09 to 91d5776 Compare September 12, 2022 15:42
@artembilan
Copy link
Member Author

OK. Pushed something to incorporate your comments.

Thank you!

* Populate an `ObservationRegistry` bean from the `IntegrationManagementConfigurer`
into all the `IntegrationManagement` components
* Introduce `MessageReceiverContext` and `MessageSenderContext` for easier usage
in the target code
* Implement `Observation` handling in the `AbstractMessageHandler`
* Modify `ObservationPropagationChannelInterceptorTests` for new `MessageSenderContext`
* Use `BridgeHandler` to ensure that `Observation` is propagated and handled properly
* Verify that tags from the `AbstractMessageHandler` are preset on the consumer span
…andler`

* Use more meaningful prefix for Spring Integration tags
@artembilan artembilan force-pushed the Observation_Infrastructure branch from 91d5776 to af7a26b Compare September 12, 2022 16:34
@marcingrzejszczak
Copy link
Contributor

It still would be super beneficial to use the SampleTestRunner with Zipkin to check how the whole trace looks like - https://micrometer.io/docs/tracing#_running_integration_tests

…ntion`

into `DefaultMessageReceiverObservationConvention` per se as an `INSTANCE` constant
* Use `MeterRegistryAssert` in the `ObservationPropagationChannelInterceptorTests`
to verify meters emitted
@artembilan
Copy link
Member Author

@marcingrzejszczak ,

does this look like you'd expect?

image

Sorry, I'm not familiar with Zipkin and what it could show me 😄


@Override
public TracingSetup[] getTracingSetup() {
return new TracingSetup[]{ TracingSetup.IN_MEMORY_BRAVE, TracingSetup.ZIPKIN_BRAVE };
Copy link
Contributor

@marcingrzejszczak marcingrzejszczak Sep 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you leaving out OTel ? BTW the Wavefront setup will be automatically ignored cause you didn't provide any tokens.

Do you have a screenshot from Zipkin? I'm super interested in how it looks like :)

EDIT: I see the screenshot above, sorry

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should I care about OTel?
It is really not Spring Integration responsibility to check the target tracing protocol.
I have even excluded OTel and Wavefront dependencies: it is not my project responsibility to test Observation low-level libraries.
I think I just have to believe to your asserts and that's it 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should I care about OTel?

Because we're providing 2 tracers for this SampleTestRunner and we want to be sure that using both of them things are looking good.

It is really not Spring Integration responsibility to check the target tracing protocol.

You can also just write unit tests with mocks - it's all about being more certain that things are working fine

I have even excluded OTel and Wavefront dependencies: it is not my project responsibility to test Observation low-level libraries.

SampleTestRunner is about giving you more confidence that things are looking good with tracing visualization tools. If I was a Spring Integration project maintainer then I would like to be pretty sure that things are looking good regardless of the used tracer. Also I would like to be sure that things are looking good in Wavefront (have you checked that?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this sounds like testing a RestTemplate against different HTTP client impls, which is really not my project responsibility where I should worry proper RestTemplate calls from my code. How RestTemplate interacts with a provided HTTP client is really outside of my project scope.
OTel, Wavefront, Zipkin - all of these are new for me and I need to read about them more.
Meanwhile I still don't see what and how I would fix in my code if something is not OK with those techs. Just because I don't do anything from my code to affect it somehow.
Doesn't look like it is similar to RDBMS where each vendor may have its own specific SQL extensions we should care about.

If you still insist on enabling them, I'll revert my restrictions, but, please, be my guest to help me to determine "that things are looking good" 😄

@marcingrzejszczak
Copy link
Contributor

does this look like you'd expect?

You tell me :D So by looking at the graph I see that via Spring Integration we've managed to send a message from one channel and receive it on the other one. It took close to no time to process it.

If you were to debug latency issues or production bugs - would this information be useful? Would you need more / less (would you add more tags) ?

@artembilan
Copy link
Member Author

we've managed to send a message from one channel and receive it on the other one.

Well, it's a bit complicated than you think and it is not finished from the framework perspective yet.

So, I wrap sending into a start observation - this part will be implemented eventually in the MessageChannel.send(). (Was a bit surprised though to see extra parent span: the sample runner does that for us 😄 )
Then a bit propagation magic happens. My channel under test is a QueueChannel where consumption from it happens on another, scheduled thread. I transfer an Observation via specific message property.
On that consumer thread I restore the Observation into current scope. See my ObservationPropagationChannelInterceptor.
The this message is handled by the BridgeHandler and we see a span for it contributed by the infrastructure introduced in this PR.

We indeed see "no time to process" because there is really no latency: producer just place a message into a queue and close its observation.
The consumer happens on another thread and just replies that message for assertion.

I don't have a production experience with tracing to judge how is this useful.
However I'd like to see some visual connection between producer and consumer kinds...

Thank you for looking into this anyway!

@marcingrzejszczak
Copy link
Contributor

So, I wrap sending into a start observation - this part will be implemented eventually in the MessageChannel.send(). (Was a bit surprised though to see extra parent span: the sample runner does that for us smile )

Yeah, it does :)

However I'd like to see some visual connection between producer and consumer kinds...

If in Zipkin you click the SHOW ALL ANNOTATIONS button you will see 2 annotations. One will be that a PRODUCER start and stop events and then on the recipient side you will have the same but for a CONSUMER

@artembilan
Copy link
Member Author

SHOW ALL ANNOTATIONS

Yep! Saw them. I worry more about if I'm going to have many producer and consumer spans in my trace. It seems there has to be a parent-child connection between them.
Now imaging I use publish-subscribe. So, one producer span an many consumer spans. That might be great to see a connection between them. Perhaps not a parent-child, but like a source span since with an async handling we may stop the parent far before those other threads handles their own spans...

@garyrussell
Copy link
Contributor

I wonder if you should first concentrate on instrumenting just the in/outbound endpoints.

While it might be interesting to trace the entire flow within the app, I doubt people will want that volume of tracing all the time, perhaps they might just want to enable it to see which component(s) in a flow are the bottleneck.

@marcingrzejszczak
Copy link
Contributor

It seems there has to be a parent-child connection between them.

There is a parent child relationship because the spans are nested below each other. Look to the left that you can fold the child spans of test send. In this case you have only 1 so there's not much to fold :)

@artembilan
Copy link
Member Author

Well, Gary, the point is that channel could be not only publish-subscribe, but also persistent and distributed, so just a single in and single out for the application is not correct assumption since the whole integration solution could be simply distributed.
Therefore it feels legit to have a MessageChannel traced as well.

@marcingrzejszczak ,

any other concerns for this PR, you'd like me to address?
Probably this is ready for merge and we are good to go ahead with other components instrumentation in other PRs?

@marcingrzejszczak
Copy link
Contributor

Yes, looks good to me!

@garyrussell garyrussell merged commit 8c73a2d into spring-projects:main Sep 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants