StableLM inference triggers GGML_ASSERT(n_embd_head == hparams.n_rot); #4901

Closed
@brittlewis12

Thanks very much for your fantastic work with this library @ggerganov!

I updated to the latest llama.cpp (revision: e790eef) this morning.

StableLM-Zephyr-3b quants now trigger GGML_ASSERT(n_embd_head == hparams.n_rot);, which was introduced in #4889 (f445c0e#diff-150dc86746a90bad4fc2c3334aeb9b5887b3adad3cc1459446717638605348efR5533).

For the quants I have (Q4_K_M), the printed values are:

llm_load_print_meta: n_rot            = 20
llm_load_print_meta: n_embd_head_k    = 80
llm_load_print_meta: n_embd_head_v    = 80
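
To make the mismatch concrete, here is a minimal sketch of the comparison the assert performs, fed with the three values printed above. The struct and function names are hypothetical and not part of llama.cpp, and the sketch assumes that n_embd_head in the assert corresponds to the printed n_embd_head_k / n_embd_head_v value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the three values reported by llm_load_print_meta.
// Not a llama.cpp type; it exists only for this illustration.
struct MetaValues {
    int64_t n_rot;
    int64_t n_embd_head_k;
    int64_t n_embd_head_v;
};

// Same condition as GGML_ASSERT(n_embd_head == hparams.n_rot), under the
// assumption that n_embd_head is the per-head size printed as n_embd_head_v.
bool rot_matches_head(const MetaValues & m) {
    return m.n_embd_head_v == m.n_rot;
}

// The values from the log output in this report.
MetaValues reported_values() {
    return MetaValues{ /*n_rot=*/20, /*n_embd_head_k=*/80, /*n_embd_head_v=*/80 };
}
```

With the reported values, rot_matches_head(reported_values()) is false (80 != 20), which matches the assert firing. It would only hold if n_rot were equal to the head size.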

You mentioned a chance of breaking Persimmon:

All models now will use the hparams.n_rot value instead of relying on a custom parameter (like n_embd_head). Both for ggml_rope_custom and llm_build_k_shift. I suspect this might break Persimmon inference, because I'm not sure if hparams.n_rot is correctly populated in the meta data of the model, but if that is the case, then it should be fixed.

Is this the same kind of issue as with Persimmon, where the model metadata is incorrect, or is this a distinct problem?
