Issues: ggml-org/llama.cpp
Research: How to integrate VITA 1.5 for multi-modal GGUF deployment?
research 🔬
#13520 opened May 14, 2025 by jordanqi (5 tasks)
ggml-quants : weighted rounding algorithms with cumulative search
generation quality
Quality of model output
ggml
changes relating to the ggml tensor library for machine learning
Less than 4 bits
Efforts related to viable quantized models using <4 bits
research 🔬
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
Tensor Encoding Scheme
https://github.com/ggerganov/llama.cpp/wiki/Tensor-Encoding-Schemes
ggml : add ANE backend
help wanted
Extra attention is needed
research 🔬
roadmap
Part of a roadmap project
#10453 opened Nov 22, 2024 by ggerganov
ggml : add WebGPU backend
help wanted
Extra attention is needed
research 🔬
roadmap
Part of a roadmap project
#7773 opened Jun 5, 2024 by ggerganov
ggml : add DirectML backend
help wanted
Extra attention is needed
research 🔬
roadmap
Part of a roadmap project
#7772 opened Jun 5, 2024 by ggerganov
llama : support Mamba-2
model
Model specific
research 🔬
roadmap
Part of a roadmap project
#7727 opened Jun 4, 2024 by ggerganov
metal : compile-time kernel args and params
performance
Speed related topics
research 🔬
roadmap
Part of a roadmap project
#4085 opened Nov 15, 2023 by ggerganov
Layer skipping/self-speculation demo
demo
Demonstrate some concept or idea, not intended to be merged
research 🔬
#3565 opened Oct 10, 2023 by KerfuffleV2 • Draft
llama : combined beam search + grammar sampling strategy
generation quality
Quality of model output
good first issue
Good for newcomers
research 🔬
roadmap
Part of a roadmap project
#2923 opened Aug 31, 2023 by ggerganov
mpi : attempt inference of 65B LLaMA on a cluster of Raspberry Pis
hardware
Hardware related
help wanted
Extra attention is needed
research 🔬
🦙.
llama
#2164 opened Jul 10, 2023 by ggerganov
[IDEA] Global token enhancement/depression
help wanted
Extra attention is needed
research 🔬
#1865 opened Jun 15, 2023 by elephantpanda
Added Arbitrary mixed quantization
Less than 4 bits
Efforts related to viable quantized models using <4 bits
research 🔬
#1834 opened Jun 13, 2023 by Milkdrop
Q4_0 scale selection using RMSE
enhancement
New feature or request
Less than 4 bits
Efforts related to viable quantized models using <4 bits
research 🔬
Review Complexity : High
Generally require indepth knowledge of LLMs or GPUs
Study how LM Evaluation Harness works and try to implement it
enhancement
New feature or request
generation quality
Quality of model output
help wanted
Extra attention is needed
high priority
Very important issue
research 🔬
#231 opened Mar 17, 2023 by ggerganov
ProTip! Exclude everything labeled bug with -label:bug.
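For example, an illustrative search query (using GitHub's standard issue-search qualifiers; the query itself is not from this page) that lists the research-labeled issues above while excluding anything labeled bug:

is:issue is:open label:"research 🔬" -label:bug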