Description
Variable bit rate is commonly used in audio and video compression, so why not try on LLMs?
My guess is that a locally adaptive variable bit rate would require a major change to ggml
. So, then, the least one can try is to see if using different number of bits in the different network layers would be beneficial.
As a first step, I simply changed llama.cpp
to not quantize one of the tensor types in addition to output.weight
(which is already known to have a significant impact on generation quality) and calculated perplexity for Q2_4
quantization (see issue #1240). Picked 2-bit quantization because there the difference between a quantized and not quantized tensor will be largest, so it would be easiest to see the effect. The following table summarizes the results (PPL improvement is perplexity with fp16
output.weight
- perplexity with fp16
output weight
+ indicated tensor, table is sorted in decreasing order of impact)
Tensor type | PPL improvement |
---|---|
feed_forward.w2 | 1.0832 |
attention.wv | 0.7819 |
feed_forward.w3 | 0.5658 |
feed_forward.w1 | 0.3917 |
attention.wo | 0.3902 |
attention.wq | 0.1250 |
attention.wk | 0.1090 |
tok_embeddings | < 0.05 |
Interesting to note that the tok_embeddings
tensor, which has been considered worthy of a dedicated quantization type LLAMA_FTYPE_MOSTLY_Q4_1_SOME_F16
where it is kept as fp16
, has basically no influence on generation quality even when quantized with 2 bits.
Based on these findings, I ran 2- and 3-bit perplexity calculations where the top 2 tensors feed_forward.w2
and attention.wv
are quantized using Q5_1
instead of Q2_4
or Q3_4
. Here are the results:
Model | Quantization | file size | Perplexity |
---|---|---|---|
7B | Q2_4 + Q5_1 | 3.16G | 6.8160 |
7B | Q3_4 + Q5_1 | 3.7G | 6.0996 |
13B | Q2_4 + Q5_1 | 6.1G | 5.7880 |
13B | Q3_4 + Q5_1 | 7.1G | 5.3715 |
Interesting to note that the mixed Q3_4 + Q5_1
quantization has a lower perplexity than any 4-bit quantization listed on the main page for the 7B model despite the smaller quantized model size.
I have not explored only using Q5_1
for a subset of the feed_forward.w2
and attention.wv
tensors. The quantization rmse for these two tensor types increases with layer depth, so perhaps it would be sufficient to use a higher bit rate for only the last few layers, thus reducing quantized model size compared to what is given in the above table.
Here are the complete runs for the above table. There is no new quantization type, just a quick hack where I added
if (tensor.name == "output.weight" ||
tensor.name.find("attention.wv.weight") != std::string::npos ||
tensor.name.find("feed_forward.w2.weight") != std::string::npos) {
new_type = GGML_TYPE_Q5_1;
}
just after this line in llama.cpp
.
7B, Q2_4 + Q5_1
main: seed = 1682863345 llama.cpp: loading model from ../build/junk.bin llama_model_load_internal: format = ggjt v1 (latest) llama_model_load_internal: n_vocab = 32000 llama_model_load_internal: n_ctx = 512 llama_model_load_internal: n_embd = 4096 llama_model_load_internal: n_mult = 256 llama_model_load_internal: n_head = 32 llama_model_load_internal: n_layer = 32 llama_model_load_internal: n_rot = 128 llama_model_load_internal: ftype = 15 (mostly Q2_4) llama_model_load_internal: n_ff = 11008 llama_model_load_internal: n_parts = 1 llama_model_load_internal: model size = 7B llama_model_load_internal: ggml ctx size = 59.11 KB llama_model_load_internal: mem required = 5026.65 MB (+ 1026.00 MB per state) llama_init_from_file: kv self size = 256.00 MBsystem_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
1.61 seconds per pass - ETA 17 minutes
[1]4.8689,[2]5.5380,[3]6.3512,[4]7.0241,[5]7.0748,[6]7.0472,[7]7.2764,[8]7.3672,[9]7.7450,[10]8.0356,[11]8.2671,[12]8.3551,[13]8.2873,[14]8.4518,[15]8.7368,[16]8.2773,[17]8.1054,[18]8.0547,[19]7.6308,[20]7.5978,[21]7.4992,[22]7.3066,[23]7.2867,[24]7.1873,[25]7.1983,[26]7.0220,[27]6.8122,[28]6.7195,[29]6.6220,[30]6.4602,[31]6.4277,[32]6.4450,[33]6.3845,[34]6.4217,[35]6.4431,[36]6.4962,[37]6.4978,[38]6.5093,[39]6.5542,[40]6.6065,[41]6.6159,[42]6.6562,[43]6.6069,[44]6.6684,[45]6.6746,[46]6.6483,[47]6.6718,[48]6.6386,[49]6.6381,[50]6.5934,[51]6.5892,[52]6.5701,[53]6.6233,[54]6.6083,[55]6.5750,[56]6.6126,[57]6.6321,[58]6.6554,[59]6.6693,[60]6.7166,[61]6.7055,[62]6.7665,[63]6.8012,[64]6.8124,[65]6.8597,[66]6.8646,[67]6.8819,[68]6.8975,[69]6.9258,[70]6.9585,[71]6.9829,[72]7.0148,[73]7.0743,[74]7.0740,[75]7.0859,[76]7.0966,[77]7.1070,[78]7.0913,[79]7.1232,[80]7.1149,[81]7.1345,[82]7.1442,[83]7.0841,[84]7.0665,[85]7.0551,[86]7.0270,[87]6.9683,[88]6.9379,[89]6.9188,[90]6.9039,[91]6.9317,[92]6.9235,[93]6.9269,[94]6.9251,[95]6.9596,[96]6.9609,[97]6.9612,[98]6.9529,[99]6.9338,[100]6.9302,[101]6.9593,[102]6.9523,[103]6.9737,[104]6.9851,[105]6.9858,[106]7.0021,[107]6.9985,[108]7.0140,[109]7.0090,[110]7.0046,[111]7.0250,[112]7.0456,[113]7.0524,[114]7.0490,[115]7.0560,[116]7.0459,[117]7.0517,[118]7.0811,[119]7.1028,[120]7.1413,[121]7.1622,[122]7.1874,[123]7.2290,[124]7.2475,[125]7.2360,[126]7.2746,[127]7.3097,[128]7.3423,[129]7.3207,[130]7.3302,[131]7.3224,[132]7.3129,[133]7.3003,[134]7.3145,[135]7.3089,[136]7.2968,[137]7.2890,[138]7.2756,[139]7.2651,[140]7.2619,[141]7.2379,[142]7.2336,[143]7.2057,[144]7.1847,[145]7.1773,[146]7.1654,[147]7.1718,[148]7.1704,[149]7.1673,[150]7.1636,[151]7.1664,[152]7.1540,[153]7.1358,[154]7.1244,[155]7.1304,[156]7.1259,[157]7.1421,[158]7.1439,[159]7.1515,[160]7.1542,[161]7.1698,[162]7.1374,[163]7.1246,[164]7.0971,[165]7.0609,[166]7.0288,[167]6.9864,[168]6.9518,[169]6.9381,[170]6.9242,[171]6.8931,[172]6.8715,[173]6.8532,[174]6.8210,[175]6.7957,[176]6.7799,[177]6.7599,[178]6.7353,[179]6.7163,[180]6.7042,[181]6.6793,[182]6.6581,[183]6.6411,[184]6.6389,[185]6.6304,[186]6.6321,[187]6.6376,[188]6.6348,[189]6.6532,[190]6.6555,[191]6.6786,[192]6.6938,[193]6.7137,[194]6.7257,[195]6.7503,[196]6.7668,[197]6.7887,[198]6.8067,[199]6.8101,[200]6.8141,[201]6.8096,[202]6.8342,[203]6.8424,[204]6.8470,[205]6.8581,[206]6.8657,[207]6.8619,[208]6.8726,[209]6.8780,[210]6.8816,[211]6.8934,[212]6.9011,[213]6.9121,[214]6.9185,[215]6.9231,[216]6.9367,[217]6.9563,[218]6.9708,[219]6.9702,[220]6.9650,[221]6.9601,[222]6.9569,[223]6.9457,[224]6.9389,[225]6.9346,[226]6.9554,[227]6.9661,[228]6.9727,[229]6.9772,[230]6.9739,[231]6.9931,[232]6.9831,[233]6.9635,[234]6.9467,[235]6.9311,[236]6.9226,[237]6.9113,[238]6.9156,[239]6.8994,[240]6.8880,[241]6.8921,[242]6.8965,[243]6.8932,[244]6.8800,[245]6.8776,[246]6.8658,[247]6.8522,[248]6.8445,[249]6.8409,[250]6.8471,[251]6.8399,[252]6.8348,[253]6.8244,[254]6.8182,[255]6.8047,[256]6.7833,[257]6.7706,[258]6.7620,[259]6.7591,[260]6.7503,[261]6.7464,[262]6.7397,[263]6.7344,[264]6.7181,[265]6.7174,[266]6.7143,[267]6.7061,[268]6.7162,[269]6.7142,[270]6.7137,[271]6.7213,[272]6.7260,[273]6.7251,[274]6.7284,[275]6.7389,[276]6.7449,[277]6.7640,[278]6.7748,[279]6.7843,[280]6.7867,[281]6.7962,[282]6.8019,[283]6.8184,[284]6.8264,[285]6.8361,[286]6.8503,[287]6.8506,[288]6.8577,[289]6.8476,[290]6.8306,[291]6.8143,[292]6.7983,[293]6.7837,[294]6.7858,[295]6.7836,[296]6.7889,[297]6.7877,[298]6.7916,[299]6.7881,[300]6.7765,[301]6.7751,[302]6.7663,[303]6.7569,[304]6.7478,[305]6.7439,[306]6.7292,[307]6.7309,[308]6.7356,[309]6.7174,[310]6.7112,[311]6.7044,[312]6.7065,[313]6.6999,[314]6.6979,[315]6.6804,[316]6.6768,[317]6.6583,[318]6.6363,[319]6.6513,[320]6.6646,[321]6.6703,[322]6.6661,[323]6.6598,[324]6.6586,[325]6.6711,[326]6.6704,[327]6.6734,[328]6.6757,[329]6.6834,[330]6.6872,[331]6.7003,[332]6.6962,[333]6.7041,[334]6.6979,[335]6.6901,[336]6.6922,[337]6.6888,[338]6.6896,[339]6.6829,[340]6.6784,[341]6.6871,[342]6.6901,[343]6.6962,[344]6.6964,[345]6.6953,[346]6.6908,[347]6.6946,[348]6.6986,[349]6.6998,[350]6.6957,[351]6.6962,[352]6.6968,[353]6.6902,[354]6.6924,[355]6.6989,[356]6.7021,[357]6.6984,[358]6.7090,[359]6.7119,[360]6.7066,[361]6.7052,[362]6.7144,[363]6.7251,[364]6.7318,[365]6.7372,[366]6.7380,[367]6.7469,[368]6.7429,[369]6.7430,[370]6.7444,[371]6.7381,[372]6.7424,[373]6.7475,[374]6.7446,[375]6.7432,[376]6.7509,[377]6.7460,[378]6.7483,[379]6.7546,[380]6.7455,[381]6.7411,[382]6.7370,[383]6.7360,[384]6.7360,[385]6.7342,[386]6.7340,[387]6.7332,[388]6.7288,[389]6.7223,[390]6.7152,[391]6.7065,[392]6.7026,[393]6.7020,[394]6.7049,[395]6.7036,[396]6.6953,[397]6.7020,[398]6.7058,[399]6.7139,[400]6.7138,[401]6.7148,[402]6.7157,[403]6.7171,[404]6.7237,[405]6.7152,[406]6.7121,[407]6.7122,[408]6.7134,[409]6.7250,[410]6.7371,[411]6.7497,[412]6.7668,[413]6.7790,[414]6.7873,[415]6.7931,[416]6.8026,[417]6.8166,[418]6.8216,[419]6.8294,[420]6.8394,[421]6.8512,[422]6.8556,[423]6.8647,[424]6.8765,[425]6.8857,[426]6.8922,[427]6.8970,[428]6.9061,[429]6.9121,[430]6.9217,[431]6.9381,[432]6.9418,[433]6.9406,[434]6.9354,[435]6.9354,[436]6.9371,[437]6.9470,[438]6.9553,[439]6.9516,[440]6.9503,[441]6.9448,[442]6.9418,[443]6.9432,[444]6.9441,[445]6.9416,[446]6.9440,[447]6.9471,[448]6.9513,[449]6.9487,[450]6.9482,[451]6.9434,[452]6.9351,[453]6.9262,[454]6.9219,[455]6.9226,[456]6.9276,[457]6.9302,[458]6.9276,[459]6.9285,[460]6.9373,[461]6.9347,[462]6.9338,[463]6.9396,[464]6.9390,[465]6.9360,[466]6.9289,[467]6.9300,[468]6.9310,[469]6.9335,[470]6.9342,[471]6.9290,[472]6.9336,[473]6.9272,[474]6.9292,[475]6.9246,[476]6.9274,[477]6.9200,[478]6.9199,[479]6.9293,[480]6.9351,[481]6.9373,[482]6.9325,[483]6.9273,[484]6.9295,[485]6.9287,[486]6.9227,[487]6.9233,[488]6.9219,[489]6.9161,[490]6.9132,[491]6.9109,[492]6.9048,[493]6.9019,[494]6.9008,[495]6.9013,[496]6.8970,[497]6.8923,[498]6.8908,[499]6.8846,[500]6.8749,[501]6.8686,[502]6.8689,[503]6.8673,[504]6.8577,[505]6.8599,[506]6.8605,[507]6.8559,[508]6.8514,[509]6.8506,[510]6.8548,[511]6.8601,[512]6.8632,[513]6.8650,[514]6.8722,[515]6.8664,[516]6.8648,[517]6.8661,[518]6.8655,[519]6.8687,[520]6.8713,[521]6.8724,[522]6.8757,[523]6.8762,[524]6.8826,[525]6.8860,[526]6.8879,[527]6.8899,[528]6.8851,[529]6.8861,[530]6.8799,[531]6.8775,[532]6.8826,[533]6.8846,[534]6.8821,[535]6.8854,[536]6.8789,[537]6.8761,[538]6.8816,[539]6.8823,[540]6.8880,[541]6.8893,[542]6.8899,[543]6.8916,[544]6.8923,[545]6.8906,[546]6.8919,[547]6.8868,[548]6.8803,[549]6.8806,[550]6.8770,[551]6.8732,[552]6.8708,[553]6.8665,[554]6.8638,[555]6.8593,[556]6.8589,[557]6.8627,[558]6.8587,[559]6.8583,[560]6.8578,[561]6.8580,[562]6.8555,[563]6.8559,[564]6.8605,[565]6.8629,[566]6.8624,[567]6.8599,[568]6.8593,[569]6.8572,[570]6.8601,[571]6.8603,[572]6.8609,[573]6.8598,[574]6.8565,[575]6.8570,[576]6.8572,[577]6.8550,[578]6.8537,[579]6.8547,[580]6.8475,[581]6.8430,[582]6.8412,[583]6.8415,[584]6.8412,[585]6.8333,[586]6.8264,[587]6.8268,[588]6.8321,[589]6.8388,[590]6.8420,[591]6.8438,[592]6.8424,[593]6.8377,[594]6.8379,[595]6.8352,[596]6.8387,[597]6.8353,[598]6.8323,[599]6.8340,[600]6.8332,[601]6.8313,[602]6.8341,[603]6.8370,[604]6.8381,[605]6.8409,[606]6.8429,[607]6.8417,[608]6.8373,[609]6.8375,[610]6.8410,[611]6.8390,[612]6.8421,[613]6.8380,[614]6.8332,[615]6.8241,[616]6.8281,[617]6.8211,[618]6.8153,[619]6.8084,[620]6.7923,[621]6.7843,[622]6.7822,[623]6.7841,[624]6.7839,[625]6.7838,[626]6.7822,[627]6.7851,[628]6.7848,[629]6.7844,[630]6.7879,[631]6.7937,[632]6.7994,[633]6.7975,[634]6.8018,[635]6.8029,[636]6.8007,[637]6.7977,[638]6.8010,[639]6.7979,[640]6.7991,[641]6.7990,[642]6.8064,[643]6.8086,[644]6.8094,[645]6.8069,[646]6.8121,[647]6.8090,[648]6.8099,[649]6.8094,[650]6.8144,[651]6.8197,[652]6.8211,[653]6.8252,[654]6.8175,[655]6.8160,
llama_print_timings: load time = 2658.59 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 976009.34 ms / 335360 tokens ( 2.91 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 1004906.69 ms
7B, Q3_4 + Q5_1
main: seed = 1682864367 llama.cpp: loading model from ../build/junk1.bin llama_model_load_internal: format = ggjt v1 (latest) llama_model_load_internal: n_vocab = 32000 llama_model_load_internal: n_ctx = 512 llama_model_load_internal: n_embd = 4096 llama_model_load_internal: n_mult = 256 llama_model_load_internal: n_head = 32 llama_model_load_internal: n_layer = 32 llama_model_load_internal: n_rot = 128 llama_model_load_internal: ftype = 10 (mostly Q3_4) llama_model_load_internal: n_ff = 11008 llama_model_load_internal: n_parts = 1 llama_model_load_internal: model size = 7B llama_model_load_internal: ggml ctx size = 59.11 KB llama_model_load_internal: mem required = 5578.28 MB (+ 1026.00 MB per state) llama_init_from_file: kv self size = 256.00 MBsystem_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
1.69 seconds per pass - ETA 18 minutes
[1]4.3772,[2]4.7902,[3]5.6644,[4]6.2926,[5]6.4094,[6]6.3682,[7]6.5670,[8]6.6689,[9]7.0093,[10]7.2537,[11]7.4667,[12]7.5133,[13]7.4469,[14]7.5018,[15]7.7581,[16]7.3727,[17]7.2507,[18]7.1917,[19]6.8302,[20]6.8186,[21]6.7251,[22]6.5571,[23]6.5252,[24]6.4336,[25]6.4301,[26]6.2650,[27]6.0943,[28]5.9983,[29]5.9072,[30]5.7545,[31]5.7284,[32]5.7503,[33]5.6977,[34]5.7261,[35]5.7495,[36]5.7812,[37]5.7814,[38]5.7930,[39]5.8255,[40]5.8782,[41]5.8900,[42]5.9274,[43]5.8861,[44]5.9426,[45]5.9455,[46]5.9212,[47]5.9424,[48]5.9169,[49]5.9180,[50]5.8766,[51]5.8735,[52]5.8628,[53]5.9067,[54]5.8907,[55]5.8671,[56]5.8970,[57]5.9166,[58]5.9371,[59]5.9534,[60]5.9951,[61]5.9885,[62]6.0481,[63]6.0787,[64]6.0918,[65]6.1345,[66]6.1431,[67]6.1623,[68]6.1757,[69]6.1978,[70]6.2297,[71]6.2521,[72]6.2830,[73]6.3437,[74]6.3470,[75]6.3602,[76]6.3750,[77]6.3875,[78]6.3715,[79]6.4006,[80]6.3955,[81]6.4061,[82]6.4121,[83]6.3619,[84]6.3428,[85]6.3290,[86]6.3068,[87]6.2470,[88]6.2224,[89]6.2023,[90]6.1887,[91]6.2120,[92]6.2071,[93]6.2058,[94]6.2024,[95]6.2297,[96]6.2300,[97]6.2245,[98]6.2201,[99]6.2056,[100]6.2044,[101]6.2303,[102]6.2271,[103]6.2465,[104]6.2540,[105]6.2524,[106]6.2693,[107]6.2680,[108]6.2817,[109]6.2775,[110]6.2736,[111]6.2939,[112]6.3155,[113]6.3180,[114]6.3130,[115]6.3189,[116]6.3090,[117]6.3133,[118]6.3412,[119]6.3632,[120]6.3973,[121]6.4124,[122]6.4367,[123]6.4732,[124]6.4919,[125]6.4823,[126]6.5207,[127]6.5572,[128]6.5871,[129]6.5713,[130]6.5796,[131]6.5760,[132]6.5691,[133]6.5555,[134]6.5644,[135]6.5596,[136]6.5491,[137]6.5408,[138]6.5234,[139]6.5126,[140]6.5101,[141]6.4830,[142]6.4794,[143]6.4498,[144]6.4283,[145]6.4197,[146]6.4070,[147]6.4104,[148]6.4101,[149]6.4047,[150]6.4007,[151]6.4037,[152]6.3940,[153]6.3789,[154]6.3707,[155]6.3774,[156]6.3727,[157]6.3896,[158]6.3943,[159]6.3981,[160]6.4011,[161]6.4143,[162]6.3863,[163]6.3744,[164]6.3508,[165]6.3203,[166]6.2935,[167]6.2558,[168]6.2253,[169]6.2117,[170]6.2011,[171]6.1750,[172]6.1585,[173]6.1421,[174]6.1119,[175]6.0898,[176]6.0779,[177]6.0581,[178]6.0349,[179]6.0179,[180]6.0087,[181]5.9869,[182]5.9695,[183]5.9546,[184]5.9524,[185]5.9448,[186]5.9454,[187]5.9516,[188]5.9485,[189]5.9654,[190]5.9665,[191]5.9878,[192]6.0035,[193]6.0195,[194]6.0306,[195]6.0520,[196]6.0674,[197]6.0883,[198]6.1032,[199]6.1060,[200]6.1102,[201]6.1052,[202]6.1251,[203]6.1329,[204]6.1330,[205]6.1432,[206]6.1498,[207]6.1461,[208]6.1546,[209]6.1593,[210]6.1638,[211]6.1749,[212]6.1826,[213]6.1933,[214]6.1958,[215]6.1980,[216]6.2120,[217]6.2295,[218]6.2431,[219]6.2435,[220]6.2400,[221]6.2343,[222]6.2324,[223]6.2227,[224]6.2156,[225]6.2119,[226]6.2319,[227]6.2393,[228]6.2448,[229]6.2501,[230]6.2464,[231]6.2637,[232]6.2520,[233]6.2350,[234]6.2204,[235]6.2017,[236]6.1944,[237]6.1850,[238]6.1878,[239]6.1730,[240]6.1623,[241]6.1645,[242]6.1681,[243]6.1662,[244]6.1551,[245]6.1519,[246]6.1415,[247]6.1301,[248]6.1230,[249]6.1208,[250]6.1253,[251]6.1191,[252]6.1150,[253]6.1047,[254]6.0999,[255]6.0884,[256]6.0706,[257]6.0587,[258]6.0507,[259]6.0489,[260]6.0406,[261]6.0367,[262]6.0311,[263]6.0259,[264]6.0068,[265]6.0064,[266]6.0045,[267]5.9985,[268]6.0072,[269]6.0053,[270]6.0054,[271]6.0129,[272]6.0167,[273]6.0167,[274]6.0187,[275]6.0269,[276]6.0326,[277]6.0485,[278]6.0587,[279]6.0675,[280]6.0697,[281]6.0794,[282]6.0850,[283]6.0996,[284]6.1077,[285]6.1164,[286]6.1293,[287]6.1286,[288]6.1348,[289]6.1263,[290]6.1109,[291]6.0960,[292]6.0814,[293]6.0681,[294]6.0702,[295]6.0689,[296]6.0739,[297]6.0731,[298]6.0757,[299]6.0734,[300]6.0626,[301]6.0623,[302]6.0550,[303]6.0461,[304]6.0376,[305]6.0340,[306]6.0213,[307]6.0235,[308]6.0259,[309]6.0104,[310]6.0047,[311]5.9983,[312]6.0006,[313]5.9954,[314]5.9937,[315]5.9781,[316]5.9735,[317]5.9570,[318]5.9369,[319]5.9493,[320]5.9613,[321]5.9657,[322]5.9617,[323]5.9553,[324]5.9524,[325]5.9627,[326]5.9630,[327]5.9652,[328]5.9687,[329]5.9742,[330]5.9772,[331]5.9890,[332]5.9866,[333]5.9933,[334]5.9878,[335]5.9821,[336]5.9857,[337]5.9832,[338]5.9830,[339]5.9774,[340]5.9734,[341]5.9818,[342]5.9841,[343]5.9892,[344]5.9893,[345]5.9891,[346]5.9866,[347]5.9903,[348]5.9941,[349]5.9966,[350]5.9933,[351]5.9943,[352]5.9944,[353]5.9882,[354]5.9888,[355]5.9939,[356]5.9971,[357]5.9933,[358]6.0026,[359]6.0051,[360]6.0016,[361]6.0011,[362]6.0076,[363]6.0186,[364]6.0244,[365]6.0288,[366]6.0300,[367]6.0383,[368]6.0357,[369]6.0366,[370]6.0381,[371]6.0325,[372]6.0374,[373]6.0423,[374]6.0407,[375]6.0407,[376]6.0479,[377]6.0431,[378]6.0457,[379]6.0517,[380]6.0436,[381]6.0401,[382]6.0350,[383]6.0341,[384]6.0336,[385]6.0324,[386]6.0319,[387]6.0321,[388]6.0285,[389]6.0233,[390]6.0163,[391]6.0085,[392]6.0043,[393]6.0027,[394]6.0054,[395]6.0045,[396]5.9971,[397]6.0038,[398]6.0070,[399]6.0144,[400]6.0143,[401]6.0162,[402]6.0175,[403]6.0195,[404]6.0260,[405]6.0171,[406]6.0138,[407]6.0137,[408]6.0151,[409]6.0260,[410]6.0373,[411]6.0488,[412]6.0648,[413]6.0758,[414]6.0837,[415]6.0893,[416]6.0976,[417]6.1097,[418]6.1132,[419]6.1196,[420]6.1286,[421]6.1404,[422]6.1437,[423]6.1507,[424]6.1612,[425]6.1700,[426]6.1762,[427]6.1805,[428]6.1889,[429]6.1939,[430]6.2021,[431]6.2158,[432]6.2192,[433]6.2186,[434]6.2143,[435]6.2147,[436]6.2173,[437]6.2269,[438]6.2344,[439]6.2311,[440]6.2303,[441]6.2254,[442]6.2239,[443]6.2248,[444]6.2251,[445]6.2235,[446]6.2256,[447]6.2287,[448]6.2331,[449]6.2309,[450]6.2317,[451]6.2277,[452]6.2160,[453]6.2077,[454]6.2022,[455]6.2032,[456]6.2077,[457]6.2097,[458]6.2074,[459]6.2078,[460]6.2164,[461]6.2138,[462]6.2124,[463]6.2159,[464]6.2147,[465]6.2119,[466]6.2042,[467]6.2045,[468]6.2042,[469]6.2063,[470]6.2065,[471]6.2018,[472]6.2062,[473]6.2008,[474]6.2022,[475]6.1968,[476]6.1979,[477]6.1909,[478]6.1900,[479]6.1963,[480]6.2009,[481]6.2029,[482]6.1985,[483]6.1947,[484]6.1965,[485]6.1950,[486]6.1894,[487]6.1893,[488]6.1870,[489]6.1822,[490]6.1798,[491]6.1769,[492]6.1713,[493]6.1685,[494]6.1666,[495]6.1657,[496]6.1621,[497]6.1566,[498]6.1552,[499]6.1510,[500]6.1418,[501]6.1353,[502]6.1357,[503]6.1348,[504]6.1260,[505]6.1281,[506]6.1291,[507]6.1237,[508]6.1196,[509]6.1191,[510]6.1228,[511]6.1273,[512]6.1307,[513]6.1326,[514]6.1387,[515]6.1333,[516]6.1324,[517]6.1335,[518]6.1329,[519]6.1359,[520]6.1383,[521]6.1395,[522]6.1422,[523]6.1429,[524]6.1484,[525]6.1514,[526]6.1522,[527]6.1539,[528]6.1487,[529]6.1497,[530]6.1448,[531]6.1435,[532]6.1480,[533]6.1501,[534]6.1479,[535]6.1502,[536]6.1447,[537]6.1427,[538]6.1480,[539]6.1493,[540]6.1532,[541]6.1536,[542]6.1547,[543]6.1562,[544]6.1572,[545]6.1552,[546]6.1558,[547]6.1516,[548]6.1467,[549]6.1467,[550]6.1435,[551]6.1400,[552]6.1379,[553]6.1341,[554]6.1320,[555]6.1288,[556]6.1284,[557]6.1311,[558]6.1276,[559]6.1275,[560]6.1274,[561]6.1276,[562]6.1254,[563]6.1250,[564]6.1294,[565]6.1314,[566]6.1311,[567]6.1289,[568]6.1295,[569]6.1282,[570]6.1314,[571]6.1318,[572]6.1329,[573]6.1328,[574]6.1292,[575]6.1285,[576]6.1285,[577]6.1270,[578]6.1253,[579]6.1258,[580]6.1193,[581]6.1157,[582]6.1148,[583]6.1157,[584]6.1159,[585]6.1085,[586]6.1019,[587]6.1027,[588]6.1075,[589]6.1127,[590]6.1157,[591]6.1177,[592]6.1163,[593]6.1129,[594]6.1139,[595]6.1116,[596]6.1149,[597]6.1126,[598]6.1096,[599]6.1119,[600]6.1112,[601]6.1097,[602]6.1113,[603]6.1138,[604]6.1148,[605]6.1181,[606]6.1204,[607]6.1190,[608]6.1156,[609]6.1163,[610]6.1198,[611]6.1182,[612]6.1210,[613]6.1173,[614]6.1121,[615]6.1047,[616]6.1072,[617]6.1011,[618]6.0962,[619]6.0906,[620]6.0769,[621]6.0702,[622]6.0685,[623]6.0700,[624]6.0706,[625]6.0708,[626]6.0697,[627]6.0722,[628]6.0721,[629]6.0717,[630]6.0749,[631]6.0804,[632]6.0860,[633]6.0846,[634]6.0879,[635]6.0883,[636]6.0857,[637]6.0821,[638]6.0845,[639]6.0814,[640]6.0823,[641]6.0823,[642]6.0888,[643]6.0912,[644]6.0926,[645]6.0911,[646]6.0953,[647]6.0917,[648]6.0926,[649]6.0928,[650]6.0968,[651]6.1019,[652]6.1030,[653]6.1068,[654]6.1003,[655]6.0996,
llama_print_timings: load time = 2722.23 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 1024717.17 ms / 335360 tokens ( 3.06 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 1052144.49 ms
13B, Q2_4 + Q5_1
main: seed = 1682869318 llama.cpp: loading model from ../build/junk3.bin llama_model_load_internal: format = ggjt v1 (latest) llama_model_load_internal: n_vocab = 32000 llama_model_load_internal: n_ctx = 512 llama_model_load_internal: n_embd = 5120 llama_model_load_internal: n_mult = 256 llama_model_load_internal: n_head = 40 llama_model_load_internal: n_layer = 40 llama_model_load_internal: n_rot = 128 llama_model_load_internal: ftype = 15 (mostly Q2_4) llama_model_load_internal: n_ff = 13824 llama_model_load_internal: n_parts = 1 llama_model_load_internal: model size = 13B llama_model_load_internal: ggml ctx size = 73.73 KB llama_model_load_internal: mem required = 8284.13 MB (+ 1608.00 MB per state) llama_init_from_file: kv self size = 400.00 MBsystem_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
2.66 seconds per pass - ETA 29 minutes
[1]3.9721,[2]4.4565,[3]5.2597,[4]5.9033,[5]6.0338,[6]5.9622,[7]6.1601,[8]6.2656,[9]6.5236,[10]6.7829,[11]7.0013,[12]7.0990,[13]7.0476,[14]7.1859,[15]7.4172,[16]7.0334,[17]6.9135,[18]6.8786,[19]6.5400,[20]6.5070,[21]6.4154,[22]6.2361,[23]6.1969,[24]6.0978,[25]6.1050,[26]5.9363,[27]5.7476,[28]5.6483,[29]5.5673,[30]5.4279,[31]5.3936,[32]5.4050,[33]5.3676,[34]5.4138,[35]5.4320,[36]5.4581,[37]5.4515,[38]5.4411,[39]5.4719,[40]5.5259,[41]5.5538,[42]5.5976,[43]5.5570,[44]5.6013,[45]5.6034,[46]5.5709,[47]5.5914,[48]5.5665,[49]5.5732,[50]5.5425,[51]5.5525,[52]5.5445,[53]5.5919,[54]5.5826,[55]5.5603,[56]5.5833,[57]5.6051,[58]5.6318,[59]5.6509,[60]5.6875,[61]5.6795,[62]5.7390,[63]5.7665,[64]5.7749,[65]5.8128,[66]5.8121,[67]5.8288,[68]5.8395,[69]5.8698,[70]5.9008,[71]5.9260,[72]5.9637,[73]6.0136,[74]6.0186,[75]6.0287,[76]6.0452,[77]6.0630,[78]6.0533,[79]6.0811,[80]6.0773,[81]6.0940,[82]6.0916,[83]6.0430,[84]6.0330,[85]6.0263,[86]6.0075,[87]5.9484,[88]5.9136,[89]5.8907,[90]5.8817,[91]5.9066,[92]5.9045,[93]5.9073,[94]5.9044,[95]5.9318,[96]5.9287,[97]5.9266,[98]5.9230,[99]5.9158,[100]5.9112,[101]5.9382,[102]5.9320,[103]5.9477,[104]5.9492,[105]5.9504,[106]5.9664,[107]5.9629,[108]5.9788,[109]5.9759,[110]5.9701,[111]5.9873,[112]6.0048,[113]6.0053,[114]6.0034,[115]6.0066,[116]5.9956,[117]5.9974,[118]6.0221,[119]6.0409,[120]6.0689,[121]6.0859,[122]6.1071,[123]6.1462,[124]6.1624,[125]6.1543,[126]6.1905,[127]6.2246,[128]6.2549,[129]6.2404,[130]6.2488,[131]6.2446,[132]6.2396,[133]6.2282,[134]6.2393,[135]6.2381,[136]6.2292,[137]6.2257,[138]6.2107,[139]6.2011,[140]6.2007,[141]6.1759,[142]6.1720,[143]6.1485,[144]6.1310,[145]6.1232,[146]6.1096,[147]6.1134,[148]6.1171,[149]6.1134,[150]6.1137,[151]6.1188,[152]6.1085,[153]6.0971,[154]6.0912,[155]6.0971,[156]6.0959,[157]6.1130,[158]6.1163,[159]6.1179,[160]6.1222,[161]6.1343,[162]6.1058,[163]6.0948,[164]6.0712,[165]6.0433,[166]6.0157,[167]5.9800,[168]5.9506,[169]5.9372,[170]5.9281,[171]5.9059,[172]5.8906,[173]5.8760,[174]5.8466,[175]5.8240,[176]5.8097,[177]5.7904,[178]5.7676,[179]5.7527,[180]5.7448,[181]5.7262,[182]5.7087,[183]5.6951,[184]5.6934,[185]5.6880,[186]5.6896,[187]5.6955,[188]5.6950,[189]5.7125,[190]5.7137,[191]5.7318,[192]5.7454,[193]5.7611,[194]5.7725,[195]5.7933,[196]5.8061,[197]5.8252,[198]5.8379,[199]5.8412,[200]5.8429,[201]5.8373,[202]5.8553,[203]5.8624,[204]5.8614,[205]5.8712,[206]5.8764,[207]5.8730,[208]5.8796,[209]5.8831,[210]5.8888,[211]5.8998,[212]5.9059,[213]5.9151,[214]5.9187,[215]5.9211,[216]5.9327,[217]5.9480,[218]5.9628,[219]5.9616,[220]5.9575,[221]5.9532,[222]5.9522,[223]5.9451,[224]5.9380,[225]5.9342,[226]5.9546,[227]5.9632,[228]5.9704,[229]5.9768,[230]5.9741,[231]5.9897,[232]5.9793,[233]5.9619,[234]5.9460,[235]5.9269,[236]5.9199,[237]5.9101,[238]5.9136,[239]5.9005,[240]5.8905,[241]5.8933,[242]5.8951,[243]5.8940,[244]5.8841,[245]5.8807,[246]5.8702,[247]5.8600,[248]5.8524,[249]5.8490,[250]5.8530,[251]5.8444,[252]5.8398,[253]5.8301,[254]5.8258,[255]5.8145,[256]5.7966,[257]5.7849,[258]5.7773,[259]5.7751,[260]5.7663,[261]5.7617,[262]5.7569,[263]5.7506,[264]5.7336,[265]5.7324,[266]5.7288,[267]5.7218,[268]5.7296,[269]5.7292,[270]5.7288,[271]5.7352,[272]5.7392,[273]5.7404,[274]5.7412,[275]5.7480,[276]5.7552,[277]5.7693,[278]5.7775,[279]5.7851,[280]5.7886,[281]5.7987,[282]5.8037,[283]5.8164,[284]5.8253,[285]5.8337,[286]5.8472,[287]5.8436,[288]5.8492,[289]5.8424,[290]5.8276,[291]5.8144,[292]5.8000,[293]5.7865,[294]5.7876,[295]5.7878,[296]5.7926,[297]5.7915,[298]5.7945,[299]5.7927,[300]5.7833,[301]5.7830,[302]5.7765,[303]5.7683,[304]5.7593,[305]5.7559,[306]5.7445,[307]5.7467,[308]5.7477,[309]5.7330,[310]5.7289,[311]5.7237,[312]5.7254,[313]5.7193,[314]5.7175,[315]5.7028,[316]5.7002,[317]5.6864,[318]5.6684,[319]5.6795,[320]5.6913,[321]5.6957,[322]5.6918,[323]5.6866,[324]5.6844,[325]5.6958,[326]5.6973,[327]5.6980,[328]5.7006,[329]5.7052,[330]5.7076,[331]5.7186,[332]5.7145,[333]5.7218,[334]5.7162,[335]5.7112,[336]5.7145,[337]5.7124,[338]5.7122,[339]5.7075,[340]5.7049,[341]5.7114,[342]5.7142,[343]5.7184,[344]5.7186,[345]5.7190,[346]5.7163,[347]5.7199,[348]5.7239,[349]5.7266,[350]5.7247,[351]5.7257,[352]5.7258,[353]5.7197,[354]5.7213,[355]5.7262,[356]5.7298,[357]5.7267,[358]5.7352,[359]5.7368,[360]5.7326,[361]5.7319,[362]5.7393,[363]5.7502,[364]5.7557,[365]5.7603,[366]5.7622,[367]5.7718,[368]5.7690,[369]5.7704,[370]5.7726,[371]5.7684,[372]5.7738,[373]5.7777,[374]5.7762,[375]5.7754,[376]5.7821,[377]5.7776,[378]5.7799,[379]5.7845,[380]5.7770,[381]5.7744,[382]5.7703,[383]5.7691,[384]5.7695,[385]5.7677,[386]5.7658,[387]5.7652,[388]5.7614,[389]5.7571,[390]5.7516,[391]5.7449,[392]5.7416,[393]5.7411,[394]5.7436,[395]5.7422,[396]5.7362,[397]5.7439,[398]5.7481,[399]5.7549,[400]5.7535,[401]5.7535,[402]5.7548,[403]5.7576,[404]5.7632,[405]5.7513,[406]5.7473,[407]5.7472,[408]5.7478,[409]5.7592,[410]5.7692,[411]5.7790,[412]5.7942,[413]5.8061,[414]5.8126,[415]5.8184,[416]5.8260,[417]5.8361,[418]5.8395,[419]5.8448,[420]5.8530,[421]5.8638,[422]5.8671,[423]5.8735,[424]5.8834,[425]5.8915,[426]5.8979,[427]5.9019,[428]5.9094,[429]5.9138,[430]5.9201,[431]5.9333,[432]5.9357,[433]5.9345,[434]5.9307,[435]5.9313,[436]5.9338,[437]5.9427,[438]5.9507,[439]5.9471,[440]5.9469,[441]5.9426,[442]5.9404,[443]5.9406,[444]5.9423,[445]5.9409,[446]5.9426,[447]5.9445,[448]5.9481,[449]5.9461,[450]5.9467,[451]5.9429,[452]5.9315,[453]5.9228,[454]5.9169,[455]5.9165,[456]5.9207,[457]5.9221,[458]5.9201,[459]5.9199,[460]5.9272,[461]5.9232,[462]5.9207,[463]5.9215,[464]5.9202,[465]5.9184,[466]5.9109,[467]5.9116,[468]5.9107,[469]5.9121,[470]5.9116,[471]5.9063,[472]5.9090,[473]5.9043,[474]5.9051,[475]5.8991,[476]5.8992,[477]5.8916,[478]5.8895,[479]5.8934,[480]5.8974,[481]5.8984,[482]5.8939,[483]5.8903,[484]5.8919,[485]5.8881,[486]5.8823,[487]5.8820,[488]5.8798,[489]5.8753,[490]5.8725,[491]5.8696,[492]5.8638,[493]5.8606,[494]5.8583,[495]5.8567,[496]5.8530,[497]5.8476,[498]5.8459,[499]5.8416,[500]5.8336,[501]5.8267,[502]5.8268,[503]5.8254,[504]5.8169,[505]5.8170,[506]5.8175,[507]5.8126,[508]5.8087,[509]5.8087,[510]5.8112,[511]5.8161,[512]5.8199,[513]5.8225,[514]5.8285,[515]5.8238,[516]5.8229,[517]5.8235,[518]5.8235,[519]5.8256,[520]5.8277,[521]5.8293,[522]5.8309,[523]5.8315,[524]5.8376,[525]5.8404,[526]5.8412,[527]5.8427,[528]5.8372,[529]5.8381,[530]5.8332,[531]5.8322,[532]5.8378,[533]5.8404,[534]5.8386,[535]5.8415,[536]5.8367,[537]5.8347,[538]5.8397,[539]5.8405,[540]5.8437,[541]5.8446,[542]5.8459,[543]5.8479,[544]5.8487,[545]5.8477,[546]5.8480,[547]5.8440,[548]5.8391,[549]5.8387,[550]5.8373,[551]5.8341,[552]5.8323,[553]5.8286,[554]5.8264,[555]5.8236,[556]5.8227,[557]5.8249,[558]5.8214,[559]5.8214,[560]5.8201,[561]5.8207,[562]5.8182,[563]5.8181,[564]5.8226,[565]5.8238,[566]5.8239,[567]5.8217,[568]5.8222,[569]5.8203,[570]5.8232,[571]5.8238,[572]5.8245,[573]5.8244,[574]5.8218,[575]5.8204,[576]5.8199,[577]5.8177,[578]5.8157,[579]5.8151,[580]5.8094,[581]5.8061,[582]5.8056,[583]5.8061,[584]5.8063,[585]5.8000,[586]5.7940,[587]5.7944,[588]5.7986,[589]5.8039,[590]5.8066,[591]5.8080,[592]5.8069,[593]5.8037,[594]5.8046,[595]5.8026,[596]5.8064,[597]5.8039,[598]5.8005,[599]5.8030,[600]5.8020,[601]5.8005,[602]5.8020,[603]5.8043,[604]5.8051,[605]5.8076,[606]5.8089,[607]5.8077,[608]5.8045,[609]5.8050,[610]5.8095,[611]5.8080,[612]5.8100,[613]5.8066,[614]5.8025,[615]5.7953,[616]5.7980,[617]5.7922,[618]5.7870,[619]5.7819,[620]5.7695,[621]5.7639,[622]5.7617,[623]5.7633,[624]5.7634,[625]5.7641,[626]5.7634,[627]5.7661,[628]5.7669,[629]5.7676,[630]5.7704,[631]5.7748,[632]5.7803,[633]5.7787,[634]5.7821,[635]5.7818,[636]5.7784,[637]5.7751,[638]5.7773,[639]5.7740,[640]5.7749,[641]5.7755,[642]5.7812,[643]5.7830,[644]5.7850,[645]5.7839,[646]5.7875,[647]5.7830,[648]5.7841,[649]5.7844,[650]5.7874,[651]5.7912,[652]5.7914,[653]5.7952,[654]5.7891,[655]5.7880,
llama_print_timings: load time = 3757.20 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 1660958.51 ms / 335360 tokens ( 4.95 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 1690237.58 ms
13B, Q3_4 + Q5_1
main: seed = 1682871034 llama.cpp: loading model from ../build/junk2.bin llama_model_load_internal: format = ggjt v1 (latest) llama_model_load_internal: n_vocab = 32000 llama_model_load_internal: n_ctx = 512 llama_model_load_internal: n_embd = 5120 llama_model_load_internal: n_mult = 256 llama_model_load_internal: n_head = 40 llama_model_load_internal: n_layer = 40 llama_model_load_internal: n_rot = 128 llama_model_load_internal: ftype = 10 (mostly Q3_4) llama_model_load_internal: n_ff = 13824 llama_model_load_internal: n_parts = 1 llama_model_load_internal: model size = 13B llama_model_load_internal: ggml ctx size = 73.73 KB llama_model_load_internal: mem required = 9353.66 MB (+ 1608.00 MB per state) llama_init_from_file: kv self size = 400.00 MBsystem_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
2.78 seconds per pass - ETA 30 minutes
[1]3.8638,[2]4.2438,[3]5.0285,[4]5.4691,[5]5.6500,[6]5.5749,[7]5.7279,[8]5.8483,[9]6.0969,[10]6.3309,[11]6.5275,[12]6.5851,[13]6.5289,[14]6.6307,[15]6.8287,[16]6.5065,[17]6.4160,[18]6.3973,[19]6.1030,[20]6.0757,[21]6.0022,[22]5.8315,[23]5.8015,[24]5.7102,[25]5.7213,[26]5.5727,[27]5.3968,[28]5.2983,[29]5.2181,[30]5.0845,[31]5.0477,[32]5.0637,[33]5.0127,[34]5.0546,[35]5.0779,[36]5.0973,[37]5.0901,[38]5.0859,[39]5.1136,[40]5.1553,[41]5.1784,[42]5.2145,[43]5.1772,[44]5.2185,[45]5.2210,[46]5.1948,[47]5.2217,[48]5.2018,[49]5.2068,[50]5.1752,[51]5.1815,[52]5.1747,[53]5.2203,[54]5.2098,[55]5.1910,[56]5.2134,[57]5.2298,[58]5.2522,[59]5.2691,[60]5.3031,[61]5.2955,[62]5.3497,[63]5.3745,[64]5.3859,[65]5.4235,[66]5.4215,[67]5.4387,[68]5.4507,[69]5.4795,[70]5.5091,[71]5.5318,[72]5.5665,[73]5.6155,[74]5.6227,[75]5.6330,[76]5.6482,[77]5.6602,[78]5.6456,[79]5.6718,[80]5.6661,[81]5.6766,[82]5.6737,[83]5.6292,[84]5.6184,[85]5.6138,[86]5.5979,[87]5.5348,[88]5.4920,[89]5.4704,[90]5.4603,[91]5.4818,[92]5.4778,[93]5.4791,[94]5.4792,[95]5.5064,[96]5.5025,[97]5.4993,[98]5.4960,[99]5.4891,[100]5.4861,[101]5.5101,[102]5.5053,[103]5.5205,[104]5.5248,[105]5.5265,[106]5.5405,[107]5.5396,[108]5.5541,[109]5.5528,[110]5.5473,[111]5.5658,[112]5.5820,[113]5.5806,[114]5.5798,[115]5.5831,[116]5.5710,[117]5.5707,[118]5.5948,[119]5.6120,[120]5.6409,[121]5.6560,[122]5.6767,[123]5.7139,[124]5.7325,[125]5.7269,[126]5.7621,[127]5.7945,[128]5.8224,[129]5.8098,[130]5.8178,[131]5.8138,[132]5.8098,[133]5.7973,[134]5.8061,[135]5.8063,[136]5.7966,[137]5.7923,[138]5.7782,[139]5.7695,[140]5.7679,[141]5.7406,[142]5.7374,[143]5.7121,[144]5.6965,[145]5.6883,[146]5.6761,[147]5.6809,[148]5.6838,[149]5.6807,[150]5.6801,[151]5.6846,[152]5.6780,[153]5.6682,[154]5.6629,[155]5.6696,[156]5.6674,[157]5.6831,[158]5.6844,[159]5.6853,[160]5.6893,[161]5.7012,[162]5.6757,[163]5.6658,[164]5.6443,[165]5.6185,[166]5.5949,[167]5.5631,[168]5.5362,[169]5.5232,[170]5.5140,[171]5.4926,[172]5.4804,[173]5.4673,[174]5.4398,[175]5.4193,[176]5.4058,[177]5.3890,[178]5.3683,[179]5.3554,[180]5.3479,[181]5.3314,[182]5.3149,[183]5.3022,[184]5.3012,[185]5.2933,[186]5.2949,[187]5.3005,[188]5.2977,[189]5.3145,[190]5.3146,[191]5.3317,[192]5.3451,[193]5.3599,[194]5.3711,[195]5.3902,[196]5.4024,[197]5.4218,[198]5.4356,[199]5.4370,[200]5.4385,[201]5.4321,[202]5.4455,[203]5.4518,[204]5.4465,[205]5.4555,[206]5.4606,[207]5.4563,[208]5.4615,[209]5.4655,[210]5.4714,[211]5.4821,[212]5.4887,[213]5.4977,[214]5.5005,[215]5.5040,[216]5.5156,[217]5.5323,[218]5.5463,[219]5.5465,[220]5.5432,[221]5.5386,[222]5.5383,[223]5.5318,[224]5.5251,[225]5.5218,[226]5.5412,[227]5.5471,[228]5.5546,[229]5.5616,[230]5.5581,[231]5.5732,[232]5.5628,[233]5.5476,[234]5.5337,[235]5.5123,[236]5.5070,[237]5.4979,[238]5.5009,[239]5.4895,[240]5.4804,[241]5.4834,[242]5.4855,[243]5.4848,[244]5.4748,[245]5.4715,[246]5.4611,[247]5.4513,[248]5.4448,[249]5.4415,[250]5.4449,[251]5.4366,[252]5.4321,[253]5.4228,[254]5.4190,[255]5.4096,[256]5.3932,[257]5.3831,[258]5.3762,[259]5.3759,[260]5.3675,[261]5.3624,[262]5.3579,[263]5.3528,[264]5.3296,[265]5.3294,[266]5.3264,[267]5.3203,[268]5.3272,[269]5.3268,[270]5.3278,[271]5.3341,[272]5.3373,[273]5.3387,[274]5.3400,[275]5.3461,[276]5.3526,[277]5.3652,[278]5.3739,[279]5.3822,[280]5.3861,[281]5.3953,[282]5.4006,[283]5.4136,[284]5.4226,[285]5.4297,[286]5.4423,[287]5.4388,[288]5.4440,[289]5.4380,[290]5.4237,[291]5.4105,[292]5.3974,[293]5.3855,[294]5.3862,[295]5.3865,[296]5.3913,[297]5.3903,[298]5.3920,[299]5.3896,[300]5.3805,[301]5.3810,[302]5.3746,[303]5.3659,[304]5.3587,[305]5.3564,[306]5.3456,[307]5.3488,[308]5.3494,[309]5.3358,[310]5.3325,[311]5.3280,[312]5.3295,[313]5.3237,[314]5.3223,[315]5.3092,[316]5.3057,[317]5.2930,[318]5.2766,[319]5.2871,[320]5.2986,[321]5.3035,[322]5.3005,[323]5.2960,[324]5.2941,[325]5.3036,[326]5.3052,[327]5.3061,[328]5.3092,[329]5.3144,[330]5.3167,[331]5.3270,[332]5.3231,[333]5.3307,[334]5.3259,[335]5.3205,[336]5.3228,[337]5.3216,[338]5.3212,[339]5.3170,[340]5.3142,[341]5.3207,[342]5.3238,[343]5.3283,[344]5.3288,[345]5.3303,[346]5.3287,[347]5.3323,[348]5.3360,[349]5.3380,[350]5.3362,[351]5.3372,[352]5.3372,[353]5.3320,[354]5.3332,[355]5.3378,[356]5.3408,[357]5.3376,[358]5.3456,[359]5.3477,[360]5.3443,[361]5.3442,[362]5.3513,[363]5.3619,[364]5.3670,[365]5.3709,[366]5.3726,[367]5.3815,[368]5.3790,[369]5.3802,[370]5.3821,[371]5.3781,[372]5.3829,[373]5.3872,[374]5.3851,[375]5.3848,[376]5.3909,[377]5.3874,[378]5.3900,[379]5.3941,[380]5.3870,[381]5.3839,[382]5.3800,[383]5.3783,[384]5.3780,[385]5.3769,[386]5.3759,[387]5.3759,[388]5.3730,[389]5.3694,[390]5.3640,[391]5.3582,[392]5.3547,[393]5.3541,[394]5.3571,[395]5.3565,[396]5.3512,[397]5.3575,[398]5.3618,[399]5.3687,[400]5.3677,[401]5.3684,[402]5.3694,[403]5.3719,[404]5.3775,[405]5.3626,[406]5.3584,[407]5.3576,[408]5.3585,[409]5.3694,[410]5.3786,[411]5.3883,[412]5.4022,[413]5.4125,[414]5.4185,[415]5.4245,[416]5.4314,[417]5.4411,[418]5.4435,[419]5.4482,[420]5.4558,[421]5.4656,[422]5.4689,[423]5.4741,[424]5.4831,[425]5.4906,[426]5.4966,[427]5.5008,[428]5.5078,[429]5.5114,[430]5.5176,[431]5.5305,[432]5.5337,[433]5.5328,[434]5.5292,[435]5.5305,[436]5.5332,[437]5.5416,[438]5.5489,[439]5.5459,[440]5.5453,[441]5.5407,[442]5.5392,[443]5.5403,[444]5.5420,[445]5.5411,[446]5.5430,[447]5.5452,[448]5.5483,[449]5.5468,[450]5.5479,[451]5.5449,[452]5.5295,[453]5.5197,[454]5.5145,[455]5.5148,[456]5.5189,[457]5.5200,[458]5.5180,[459]5.5180,[460]5.5252,[461]5.5215,[462]5.5181,[463]5.5165,[464]5.5162,[465]5.5143,[466]5.5069,[467]5.5058,[468]5.5037,[469]5.5049,[470]5.5041,[471]5.4993,[472]5.5002,[473]5.4956,[474]5.4943,[475]5.4876,[476]5.4859,[477]5.4775,[478]5.4748,[479]5.4757,[480]5.4784,[481]5.4786,[482]5.4740,[483]5.4698,[484]5.4706,[485]5.4645,[486]5.4580,[487]5.4569,[488]5.4541,[489]5.4487,[490]5.4458,[491]5.4424,[492]5.4359,[493]5.4332,[494]5.4314,[495]5.4290,[496]5.4250,[497]5.4188,[498]5.4162,[499]5.4126,[500]5.4045,[501]5.3978,[502]5.3970,[503]5.3960,[504]5.3887,[505]5.3887,[506]5.3894,[507]5.3838,[508]5.3802,[509]5.3806,[510]5.3827,[511]5.3868,[512]5.3907,[513]5.3932,[514]5.3987,[515]5.3948,[516]5.3937,[517]5.3936,[518]5.3936,[519]5.3960,[520]5.3972,[521]5.3983,[522]5.4000,[523]5.4006,[524]5.4058,[525]5.4084,[526]5.4089,[527]5.4106,[528]5.4052,[529]5.4058,[530]5.4021,[531]5.4018,[532]5.4066,[533]5.4091,[534]5.4073,[535]5.4099,[536]5.4055,[537]5.4036,[538]5.4085,[539]5.4093,[540]5.4112,[541]5.4111,[542]5.4125,[543]5.4145,[544]5.4157,[545]5.4145,[546]5.4147,[547]5.4114,[548]5.4073,[549]5.4074,[550]5.4054,[551]5.4028,[552]5.4010,[553]5.3980,[554]5.3957,[555]5.3938,[556]5.3928,[557]5.3943,[558]5.3909,[559]5.3915,[560]5.3903,[561]5.3906,[562]5.3881,[563]5.3879,[564]5.3921,[565]5.3932,[566]5.3936,[567]5.3919,[568]5.3927,[569]5.3912,[570]5.3937,[571]5.3951,[572]5.3961,[573]5.3968,[574]5.3937,[575]5.3920,[576]5.3915,[577]5.3898,[578]5.3881,[579]5.3883,[580]5.3830,[581]5.3801,[582]5.3801,[583]5.3810,[584]5.3814,[585]5.3756,[586]5.3698,[587]5.3703,[588]5.3746,[589]5.3796,[590]5.3827,[591]5.3843,[592]5.3831,[593]5.3792,[594]5.3805,[595]5.3790,[596]5.3829,[597]5.3809,[598]5.3778,[599]5.3806,[600]5.3795,[601]5.3784,[602]5.3789,[603]5.3818,[604]5.3826,[605]5.3855,[606]5.3871,[607]5.3854,[608]5.3826,[609]5.3834,[610]5.3875,[611]5.3863,[612]5.3887,[613]5.3859,[614]5.3819,[615]5.3760,[616]5.3784,[617]5.3734,[618]5.3688,[619]5.3644,[620]5.3533,[621]5.3482,[622]5.3463,[623]5.3477,[624]5.3481,[625]5.3489,[626]5.3484,[627]5.3511,[628]5.3519,[629]5.3525,[630]5.3553,[631]5.3598,[632]5.3644,[633]5.3633,[634]5.3663,[635]5.3660,[636]5.3623,[637]5.3585,[638]5.3606,[639]5.3574,[640]5.3579,[641]5.3584,[642]5.3637,[643]5.3654,[644]5.3676,[645]5.3662,[646]5.3699,[647]5.3651,[648]5.3664,[649]5.3667,[650]5.3695,[651]5.3737,[652]5.3742,[653]5.3779,[654]5.3724,[655]5.3715,
llama_print_timings: load time = 3902.02 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 1737619.18 ms / 335360 tokens ( 5.18 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 1764928.09 ms