Can you provide an example of launching a visual language model or multimodal model with Triton Server?

There is an example at https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/qwenvl, but I have no idea how to use this model in Triton Server. Can you provide an example of serving a visual language model or multimodal model?