Description
When I try to quantize any model (e.g., llama-2-7b-chat) with the command
$ ./quantize models/llama-2-7b-chat-f16.gguf models/test.gguf Q5_K
I receive an "illegal hardware instruction" error:
[ 1/ 291] token_embd.weight - [ 4096, 32000, 1, 1], type = f16, quantizing to q5_K .. [1] 92878 illegal hardware instruction ./quantize models/llama-2-7b-chat-f16.gguf Q5_K
I'm using a MacBook Pro M1 with 32 GB of RAM. I tried building llama.cpp both with and without Metal support, but I get the error in either configuration. I already checked that the quantize binary is built for the correct architecture:
$ file quantize
quantize: Mach-O 64-bit executable arm64
To find out which instruction is unsupported, I ran the binary under lldb and got the following output:
* thread #2, stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0x1e011a40)
frame #0: 0x000000010006de40 quantize`quantize_row_q5_K_reference + 924
quantize`quantize_row_q5_K_reference:
-> 0x10006de40 <+924>: .long 0x1e011a40 ; unknown opcode
0x10006de44 <+928>: fcsel s2, s18, s2, gt
0x10006de48 <+932>: fcmp s20, s1
0x10006de4c <+936>: fcsel s1, s20, s1, mi
thread #3, stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0x1e011a40)
frame #0: 0x000000010006de40 quantize`quantize_row_q5_K_reference + 924
quantize`quantize_row_q5_K_reference:
-> 0x10006de40 <+924>: .long 0x1e011a40 ; unknown opcode
0x10006de44 <+928>: fcsel s2, s18, s2, gt
0x10006de48 <+932>: fcmp s20, s1
0x10006de4c <+936>: fcsel s1, s20, s1, mi
Target 0: (quantize) stopped.
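For what it's worth, the faulting word can be sanity-checked against the A64 top-level encoding table (a rough sketch, not a real disassembler; field names follow the Arm ARM, and the constant is the ".long" value from the lldb output above):

```python
# Rough field extraction for the faulting instruction word reported by lldb.
# This only checks the top-level A64 encoding group; it is not a decoder.
word = 0x1E011A40  # the ".long" / subcode value from the lldb output

# In the A64 top-level encoding table, op0 = bits [28:25]; values matching
# x111 select the scalar floating-point / Advanced SIMD group -- the same
# group as the fcsel/fcmp instructions surrounding the fault address.
op0 = (word >> 25) & 0xF
print(f"op0 (bits 28-25) = {op0:04b}")  # prints "op0 (bits 28-25) = 1111"
```

So the word at least lives in the floating-point encoding space, which fits the surrounding disassembly, even though lldb cannot name it.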
Does anyone else have this problem? I've already tried re-downloading the model and re-converting the weights with convert.py, but I always run into the same crash. I get the error for both Q4_K (which should be the same as Q4_K_M) and Q5_K. Interestingly, Q6_K, Q8_0, Q4_0, Q4_1, etc. all run fine.