Description
Thanks very much for your fantastic work with this library @ggerganov!
Updated to latest llama.cpp (revision: e790eef) this morning.
StableLM-Zephyr-3b quants are triggering the `GGML_ASSERT(n_embd_head == hparams.n_rot);` assertion
introduced in #4889 (f445c0e#diff-150dc86746a90bad4fc2c3334aeb9b5887b3adad3cc1459446717638605348efR5533).
The values printed for the quants I have (Q4_K_M) are as follows:
```
llm_load_print_meta: n_rot          = 20
llm_load_print_meta: n_embd_head_k  = 80
llm_load_print_meta: n_embd_head_v  = 80
```
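For what it's worth, 20 / 80 = 0.25, which looks like the partial rotary fraction (the `rope_pct` / partial rotary factor of roughly 0.25, if I'm reading the model's Hugging Face config correctly). So `n_rot != n_embd_head` seems intentional for this architecture rather than a conversion artifact. A minimal sketch of that arithmetic, with the config values being assumptions on my part:

```cpp
// Minimal sketch (not llama.cpp code) of how the printed numbers relate.
// hidden_size, num_attention_heads and the rotary fraction are assumptions
// based on what I believe the HF config for stablelm-zephyr-3b contains.
#include <cassert>
#include <cstdio>

int main() {
    const int   n_embd   = 2560;  // hidden_size (assumed)
    const int   n_head   = 32;    // num_attention_heads (assumed)
    const float rope_pct = 0.25f; // partial rotary fraction (assumed)

    const int n_embd_head = n_embd / n_head;                // 80 -> matches n_embd_head_k/v
    const int n_rot       = (int)(rope_pct * n_embd_head);  // 20 -> matches n_rot

    printf("n_embd_head = %d, n_rot = %d\n", n_embd_head, n_rot);

    // The new check effectively requires these two to be equal, which cannot
    // hold for a partial-rotary model, so the GGML_ASSERT above fires:
    assert(n_embd_head == n_rot);
    return 0;
}
```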
You mentioned the chance of this breaking Persimmon:
> All models now will use the `hparams.n_rot` value instead of relying on a custom parameter (like `n_embd_head`). Both for `ggml_rope_custom` and `llm_build_k_shift`. I suspect this might break Persimmon inference, because I'm not sure if `hparams.n_rot` is correctly populated in the meta data of the model, but if that is the case, then it should be fixed.
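If I understand the quoted change correctly, the difference boils down to which width is used for the rotation, roughly as follows (a sketch on my part, not the actual llama.cpp code):

```cpp
// Rough sketch of the quoted change as I understand it; hparams_t and the two
// helpers are my own illustration, not llama.cpp identifiers.
struct hparams_t {
    int n_rot;       // rope dimension count from the GGUF metadata
    int n_embd_head; // per-head embedding size (n_embd / n_head)
};

// Before #4889: the rotated width could come from a per-model parameter such
// as n_embd_head.
int rope_dims_before(const hparams_t & hp) { return hp.n_embd_head; } // 80 for this model

// After #4889: all models take the width from hparams.n_rot.
int rope_dims_after(const hparams_t & hp) { return hp.n_rot; }        // 20 for this model
```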
Is this a similar issue to Persimmon, where the model metadata is incorrect, or is it a distinct problem?