I did some 4096-token tests on base Llama2-70B using wikitext-train. I don't have a 5.0bpw model handy to test, and I don't have the VRAM to test higher than that; I'd have to fire up a RunPod instance, and that would take hours to set up.
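For anyone reproducing these numbers: the perplexity being compared here is just the exponential of the mean per-token negative log-likelihood over the evaluation windows. A minimal, model-agnostic sketch (the `nlls` input is assumed to be the per-token cross-entropy losses your inference stack reports over a 4096-token context):

```python
import math

def perplexity(nlls):
    """Perplexity = exp(mean negative log-likelihood per token).

    nlls: per-token negative log-likelihoods in nats, e.g. the
    cross-entropy losses collected over a 4096-token window.
    """
    return math.exp(sum(nlls) / len(nlls))

# Sanity check: a model that guesses uniformly over a 4-symbol
# vocabulary has NLL = ln(4) per token, so its perplexity is ~4.
uniform_nll = math.log(4)
print(perplexity([uniform_nll] * 4096))
```

Note that quant comparisons are only meaningful when the context length, dataset, and tokenizer are held fixed, which is why the request above pins everything to 4096 tokens on the same corpus.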
As in the title, I'd like to see Llama 70B perplexity tested at 4096 tokens for every quant size. 70B-chat would also be fine. I can't do it on my end because of VRAM limits.