Optionally, you can use the following command-line flags:

| Flag | Description |
|------|-------------|
| `--cpu` | Use the CPU to generate text. |
| `--load-in-8bit` | Load the model with 8-bit precision. |
| `--load-in-4bit` | Load the model with 4-bit precision. Currently only works with LLaMA. |
| `--gptq-bits GPTQ_BITS` | Load a pre-quantized model with the specified precision. 2, 3, 4, and 8-bit are supported. Currently only works with LLaMA. |
| `--bf16` | Load the model with bfloat16 precision. Requires an NVIDIA Ampere GPU. |
| `--auto-devices` | Automatically split the model across the available GPU(s) and CPU. |
| `--disk` | If the model is too large for your GPU(s) and CPU combined, send the remaining layers to the disk. |
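These flags can be combined on a single invocation. A minimal usage sketch, assuming the launcher script is `server.py` and the model name `llama-7b` is a placeholder for a model you have already downloaded:

```shell
# Load a pre-quantized LLaMA model at 4-bit precision, splitting it across
# the available GPU(s) and CPU (script and model names are assumptions):
python server.py --model llama-7b --gptq-bits 4 --auto-devices

# Fall back to CPU-only generation:
python server.py --model llama-7b --cpu
```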