1 parent aa71356 commit 3cedb7e
README.md
@@ -10,7 +10,8 @@ Inference of [LLaMA](https://arxiv.org/abs/2302.13971) model in pure C/C++
### Hot topics
-- ⚠️ Incoming backends: https://github.com/ggerganov/llama.cpp/discussions/5138
+- Deprecated LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD: https://github.com/ggerganov/llama.cpp/discussions/5240
+- Incoming backends: https://github.com/ggerganov/llama.cpp/discussions/5138
 - [SYCL backend](README-sycl.md) is ready (1/28/2024), supports Linux/Windows on Intel GPUs (iGPU, Arc/Flex/Max series)
- New SOTA quantized models, including pure 2-bits: https://huggingface.co/ikawrakow
- Collecting Apple Silicon performance stats: