Issues: ggml-org/llama.cpp

changelog : libllama API
#9289 opened Sep 3, 2024 by ggerganov
Open 9
changelog : llama-server REST API
#9291 opened Sep 3, 2024 by ggerganov
Open 15
tutorials : list for llama.cpp
#13523 opened May 14, 2025 by ggerganov
Open 3

Issues list

musa: extract ggml_cuda_mul_mat_batched_cublas_gemm_batched_ex
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#13887 opened May 29, 2025 by yeahdongcn
3 tasks done
finetune.cpp command-line arg
Labels: build (Compilation issues); examples; ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs); testing (Everything test related)
#13873 opened May 28, 2025 by graehl
musa: enable fp16 mma (all) and cublas on qy2
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#13842 opened May 28, 2025 by yeahdongcn
3 tasks done
ggml: improve ggml_backend_cuda_cpy_tensor_async
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#13818 opened May 27, 2025 by koush
cuda: fix CMAKE_CUDA_COMPILER not found error (#13528)
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#13625 opened May 19, 2025 by lizhenneng
cuda: set cuda compiler path (#13527)
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#13528 opened May 14, 2025 by lizhenneng
musa: restore MUSA graph settings in CMakeLists.txt
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#13382 opened May 8, 2025 by yeahdongcn (Draft)
CUDA: update build CTK version to 12.8
Labels: devops (improvements to build systems and github actions); ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#13360 opened May 7, 2025 by thevishalagarwal
cuda: refactored ssm_scan and use CUB
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#13291 opened May 4, 2025 by Your-Cheese
llama : try loading tensors with pre-computed hashes
Labels: Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)); examples; ggml (changes relating to the ggml tensor library for machine learning); Kompute (https://github.com/KomputeProject/kompute/); Nvidia GPU (Issues specific to Nvidia GPUs); SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language); Vulkan (Issues specific to the Vulkan backend)
#13106 opened Apr 25, 2025 by rgerganov
[WIP] MUSA: enable fastfp16, correct warp reduce impl and perf tuning
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#12383 opened Mar 14, 2025 by BodhiHu (Draft)
tool-call: Phi-4 support
Labels: android (Issues specific to Android); Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)); devops (improvements to build systems and github actions); documentation (Improvements or additions to documentation); examples; ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs); python (python script changes); server; SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language); testing (Everything test related); Vulkan (Issues specific to the Vulkan backend)
#12288 opened Mar 9, 2025 by jpohhhh
Overlap CUDA graph building and processing to minimize GPU idle time and improve tokens per seconds performance.
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#11867 opened Feb 14, 2025 by aendk
ggml: move kvalues_iq4nl definition to ggml-common.h
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs); SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language)
#11785 opened Feb 10, 2025 by HungMingWu
Attempt to add the mllama support
Labels: Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)); examples; ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#11639 opened Feb 4, 2025 by q82419 (Draft)
3 of 5 tasks
Add support for Deepseek-R1 flash attention
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#11557 opened Jan 31, 2025 by siddartha-RE
Allow compiling cuda without mmq and flash attention
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#11190 opened Jan 11, 2025 by milot-mirdita
CUDA op getrows fails for long sequences
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#11189 opened Jan 11, 2025 by milot-mirdita
Fix ggml-cuda using a driver symbol in NO_VMM mode
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#11188 opened Jan 11, 2025 by milot-mirdita
ggml-cuda : add TQ2_0 kernels, for ternary inference on GPU
Labels: Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)); enhancement (New feature or request); ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs); performance (Speed related topics); python (python script changes); Review Complexity : High (Generally require indepth knowledge of LLMs or GPUs); testing (Everything test related)
#11183 opened Jan 10, 2025 by compilade
Fix compilation on Pop!_OS 22.04 LTS CUDA
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#10835 opened Dec 15, 2024 by mika314
Performance Tuning for Q4_K matmul CUDA kernel
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs); Review Complexity : Low (Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix)
#8136 opened Jun 26, 2024 by contentis
2 of 4 tasks
WIP: Use DirectStorage with CUDA interop to more efficient load tensors
Labels: build (Compilation issues); ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs); Review Complexity : Medium (Generally require more time to grok but manageable by beginner to medium expertise level)
#7796 opened Jun 6, 2024 by mtavenrath (Draft)
ggml : add GPU support for Mamba models
Labels: enhancement (New feature or request); help wanted (Extra attention is needed); Nvidia GPU (Issues specific to Nvidia GPUs); roadmap (Part of a roadmap project)
#6758 opened Apr 19, 2024 by ggerganov