Closed
Description
Related Issues:
Previously, it wasn't recommended to incorporate non-llama architectures into llama.cpp. However, in light of the recent addition of the Falcon architecture (see Pull Request #2717), it might be worth reconsidering this stance.
One distinguishing feature of Starcoder is its ability to provide a complete series of models ranging from 1B to 13B. This capability can prove highly beneficial for speculative decoding and making coding models available for edge devices (e.g., M1/M2 Macs).
I can contribute the PR if it matches llama.cpp's roadmap.