Issues: ggml-org/llama.cpp

changelog : libllama API
#9289 opened Sep 3, 2024 by ggerganov
Open 9
changelog : llama-server REST API
#9291 opened Sep 3, 2024 by ggerganov
Open 15
tutorials : list for llama.cpp
#13523 opened May 14, 2025 by ggerganov
Open 3

Issues list

ggml-quants : weighted rounding algorithms with cumulative search generation quality Quality of model output ggml changes relating to the ggml tensor library for machine learning Less than 4 bits Efforts related to viable quantized models using <4 bits research 🔬 Review Complexity : Medium Generally require more time to grok but manageable by beginner to medium expertise level Tensor Encoding Scheme https://github.com/ggerganov/llama.cpp/wiki/Tensor-Encoding-Schemes
#12557 opened Mar 25, 2025 by compilade Draft
ggml : add ANE backend help wanted Extra attention is needed research 🔬 roadmap Part of a roadmap project
#10453 opened Nov 22, 2024 by ggerganov
ggml : add WebGPU backend help wanted Extra attention is needed research 🔬 roadmap Part of a roadmap project
#7773 opened Jun 5, 2024 by ggerganov
ggml : add DirectML backend help wanted Extra attention is needed research 🔬 roadmap Part of a roadmap project
#7772 opened Jun 5, 2024 by ggerganov
llama : support Mamba-2 model Model specific research 🔬 roadmap Part of a roadmap project
#7727 opened Jun 4, 2024 by ggerganov
metal : compile-time kernel args and params performance Speed related topics research 🔬 roadmap Part of a roadmap project
#4085 opened Nov 15, 2023 by ggerganov
Layer skipping/self-speculation demo demo Demonstrate some concept or idea, not intended to be merged research 🔬
#3565 opened Oct 10, 2023 by KerfuffleV2 Draft
llama : combined beam search + grammar sampling strategy generation quality Quality of model output good first issue Good for newcomers research 🔬 roadmap Part of a roadmap project
#2923 opened Aug 31, 2023 by ggerganov
Added Arbitrary mixed quantization Less than 4 bits Efforts related to viable quantized models using <4 bits research 🔬
#1834 opened Jun 13, 2023 by Milkdrop
Q4_0 scale selection using RMSE enhancement New feature or request Less than 4 bits Efforts related to viable quantized models using <4 bits research 🔬 Review Complexity : High Generally require indepth knowledge of LLMs or GPUs
#835 opened Apr 7, 2023 by sw Draft
Study how LM Evaluation Harness works and try to implement it enhancement New feature or request generation quality Quality of model output help wanted Extra attention is needed high priority Very important issue research 🔬
#231 opened Mar 17, 2023 by ggerganov