
Issues: triton-inference-server/tensorrtllm_backend


Issues list

[Question] Understanding Generation Logits & Context Logits question Further information is requested triaged Issue has been triaged by maintainers
#530 opened Jul 12, 2024 by here4dadata
Invalid argument: unable to find backend library for backend '${triton_backend}' triaged Issue has been triaged by maintainers
#526 opened Jul 9, 2024 by chenchunhui97
2 of 4 tasks
Failed to read text proto from tensorrtllm_backend/triton_model_repo/tensorrt_llm/config.pbtxt bug Something isn't working triaged Issue has been triaged by maintainers
#501 opened Jun 17, 2024 by alokkrsahu
4 tasks
Multiple outputs in sampling question Further information is requested triaged Issue has been triaged by maintainers
#499 opened Jun 17, 2024 by tonylek
2x docker image size increase for trtllm: from 8.38 GB (24.03) to 18.46 GB (24.04) bug Something isn't working triaged Issue has been triaged by maintainers
#489 opened Jun 3, 2024 by lopuhin
2 of 4 tasks
[Question] Best practises to track inputs and predictions? question Further information is requested triaged Issue has been triaged by maintainers
#475 opened May 24, 2024 by FernandoDorado
random_seed seems to be ignored (or at least inconsistent) for inflight_batcher_llm bug Something isn't working triaged Issue has been triaged by maintainers
#468 opened May 21, 2024 by dyoshida-continua
2 of 4 tasks
unexpected error when creating modelInstanceState: [json.exception.out_of_range.403] key 'name' not found bug Something isn't working triaged Issue has been triaged by maintainers
#467 opened May 21, 2024 by Godlovecui
2 of 4 tasks
How to deploy one model instance across multiple GPUs to tackle the OOM problem? question Further information is requested triaged Issue has been triaged by maintainers
#462 opened May 16, 2024 by shil3754
decoding_mode top_k_top_p does not take effect for llama2 not same with huggingface triaged Issue has been triaged by maintainers
#461 opened May 16, 2024 by yjjiang11
1 of 4 tasks
Tritonserver won't start up running Smaug 34b bug Something isn't working triaged Issue has been triaged by maintainers
#459 opened May 15, 2024 by workuser12345
2 of 4 tasks
two seemingly identical functions in the same file triaged Issue has been triaged by maintainers
#458 opened May 15, 2024 by dongluw
Replace subprocess.Popen with subprocess.run triaged Issue has been triaged by maintainers
#452 opened May 14, 2024 by rlempka
[tensorrt-llm backend] A question about launch_triton_server.py question Further information is requested triaged Issue has been triaged by maintainers
#455 opened May 13, 2024 by victorsoda
InFlightBatching seems not working need more info triaged Issue has been triaged by maintainers
#442 opened May 6, 2024 by larme
2 of 4 tasks
Deployement failed for BERT triaged Issue has been triaged by maintainers
#440 opened May 3, 2024 by vivekjoshi556
Deploying Mixtral-8x7B-v0.1 with Triton 24.02 on A100 (160GB) raises "Cuda Runtime (out of memory)" exception bug Something isn't working triaged Issue has been triaged by maintainers
#438 opened Apr 29, 2024 by kelkarn
2 of 4 tasks
max_batch_size seems to have no impact on model performance bug Something isn't working triaged Issue has been triaged by maintainers
#429 opened Apr 23, 2024 by VitalyPetrov
3 of 4 tasks
Performance Issue with return_context_logits Enabled in TensorRT-LLM bug Something isn't working triaged Issue has been triaged by maintainers
#419 opened Apr 19, 2024 by metterian
2 of 4 tasks
Filtering beam_search output tensors results in a string output vs list triaged Issue has been triaged by maintainers
#418 opened Apr 18, 2024 by nikhilshandilya
Warmup Example of loading LoRa weights triaged Issue has been triaged by maintainers
#417 opened Apr 18, 2024 by TheCodeWrangler
Block reuse is currently not supported with beam width > 1 triaged Issue has been triaged by maintainers
#411 opened Apr 16, 2024 by tonylek