Issues: triton-inference-server/tensorrtllm_backend
[Question] Understanding Generation Logits & Context Logits (#530)
Labels: question, triaged. Opened Jul 12, 2024 by here4dadata.

Invalid argument: unable to find backend library for backend '${triton_backend}' (#526)
Labels: triaged. Opened Jul 9, 2024 by chenchunhui97.

How to solve errors when loading qwen1.5-7B (using two GPUs) and llama3-8B (using two GPUs) simultaneously with tritonserver? (#510)
Labels: bug, triaged. Opened Jun 21, 2024 by ChengShuting.

Failed to read text proto from tensorrtllm_backend/triton_model_repo/tensorrt_llm/config.pbtxt (#501)
Labels: bug, triaged. Opened Jun 17, 2024 by alokkrsahu.

Multiple outputs in sampling (#499)
Labels: question, triaged. Opened Jun 17, 2024 by tonylek.

2x docker image size increase for trtllm: from 8.38 GB (24.03) to 18.46 GB (24.04) (#489)
Labels: bug, triaged. Opened Jun 3, 2024 by lopuhin.

[Question] Best practices to track inputs and predictions? (#475)
Labels: question, triaged. Opened May 24, 2024 by FernandoDorado.

random_seed seems to be ignored (or at least inconsistent) for inflight_batcher_llm (#468)
Labels: bug. Opened May 21, 2024 by dyoshida-continua.

Unexpected error when creating modelInstanceState: [json.exception.out_of_range.403] key 'name' not found (#467)
Labels: bug, triaged. Opened May 21, 2024 by Godlovecui.

Can you provide an example of a visual language model or multimodal model launched by Triton server? (#463)
Labels: triaged. Opened May 20, 2024 by lzcchl.

How to deploy one model instance across multiple GPUs to tackle the OOM problem? (#462)
Labels: question, triaged. Opened May 16, 2024 by shil3754.

decoding_mode top_k_top_p does not take effect for llama2; output is not the same as Hugging Face (#461)
Labels: triaged. Opened May 16, 2024 by yjjiang11.

Tritonserver won't start up running Smaug 34b (#459)
Labels: bug, triaged. Opened May 15, 2024 by workuser12345.

Two seemingly identical functions in the same file (#458)
Labels: triaged. Opened May 15, 2024 by dongluw.

Replace subprocess.Popen with subprocess.run (#452)
Labels: triaged. Opened May 14, 2024 by rlempka. (A minimal sketch of this change follows the list.)

[tensorrt-llm backend] A question about launch_triton_server.py (#455)
Labels: question, triaged. Opened May 13, 2024 by victorsoda.

InFlightBatching does not seem to be working (#442)
Labels: need more info, triaged. Opened May 6, 2024 by larme.

Deployment failed for BERT (#440)
Labels: triaged. Opened May 3, 2024 by vivekjoshi556.

Deploying Mixtral-8x7B-v0.1 with Triton 24.02 on A100 (160GB) raises "Cuda Runtime (out of memory)" exception (#438)
Labels: bug, triaged. Opened Apr 29, 2024 by kelkarn.

max_batch_size seems to have no impact on model performance (#429)
Labels: bug. Opened Apr 23, 2024 by VitalyPetrov.

Performance issue with return_context_logits enabled in TensorRT-LLM (#419)
Labels: bug, triaged. Opened Apr 19, 2024 by metterian.

Filtering beam_search output tensors results in a string output vs. a list (#418)
Labels: triaged. Opened Apr 18, 2024 by nikhilshandilya.

Warmup example of loading LoRA weights (#417)
Labels: triaged. Opened Apr 18, 2024 by TheCodeWrangler.

Results from using inflight_batcher_llm_client to send multiple LoRA weights are not the same as using TensorRT-LLM directly (#413)
Labels: triaged. Opened Apr 17, 2024 by stifles.

Block reuse is currently not supported with beam width > 1 (#411)
Labels: triaged. Opened Apr 16, 2024 by tonylek.

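As a rough illustration of the change proposed in #452, the sketch below contrasts the two calls. The tritonserver command line is a hypothetical placeholder, not taken from the repository's launch scripts.

    import subprocess

    # Popen starts the process and returns immediately; the caller must
    # wait for it and check the exit status by hand.
    proc = subprocess.Popen(["tritonserver", "--model-repository=/models"])
    proc.wait()
    if proc.returncode != 0:
        raise RuntimeError(f"tritonserver exited with {proc.returncode}")

    # subprocess.run blocks until the process exits and, with check=True,
    # raises CalledProcessError on a nonzero exit code.
    subprocess.run(["tritonserver", "--model-repository=/models"], check=True)

Note that subprocess.run suits a blocking launch; a launcher that intentionally leaves the server running in the background would still need Popen.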