Commit 37036b3

Add Phi-4-mini README.md (#9302)
1 parent 07e9672 commit 37036b3

File tree

2 files changed: +64 -1 lines changed

examples/models/phi-4-mini/README.md

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
## Summary

Phi-4-mini Instruct (3.8B) is a newly released version of the popular Phi-4 model developed by Microsoft.

## Instructions

Phi-4-mini uses the same example code as Llama; only the checkpoint, model params, and tokenizer differ. Please see the [Llama README page](../llama/README.md) for details.

All commands for exporting and running Llama on the various backends should also work for Phi-4-mini by swapping in the following args:

```
--model phi_4_mini
--params examples/models/phi-4-mini/config.json
--checkpoint <path-to-meta-checkpoint>
```
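For convenience, the swapped args can be collected once in a shell variable and appended to any Llama command from the Llama README. A minimal sketch; the `PHI_ARGS` name is illustrative, not part of the ExecuTorch tooling:

```shell
# Illustrative helper: gather the Phi-4-mini arg swaps in one place
# so they can be reused across export and run commands.
PHI_ARGS="--model phi_4_mini --params examples/models/phi-4-mini/config.json"
echo ${PHI_ARGS}
```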

### Generate the Checkpoint

The original checkpoint can be obtained from HuggingFace:

```
huggingface-cli download microsoft/Phi-4-mini-instruct
```

We then convert it to Meta's checkpoint format:

```
python examples/models/phi-4-mini/convert_weights.py <path-to-checkpoint-dir> <output-path>
```

### Example export and run

Here is a basic example of exporting and running Phi-4-mini; please refer to the [Llama README page](../llama/README.md) for more advanced usage.

Export to XNNPACK, no quantization:

```
# No quantization
# Set this path to point to the downloaded checkpoint
PHI_CHECKPOINT=path/to/checkpoint.pth

python -m examples.models.llama.export_llama \
  --model phi_4_mini \
  --checkpoint "${PHI_CHECKPOINT:?}" \
  --params examples/models/phi-4-mini/config.json \
  -kv \
  --use_sdpa_with_kv_cache \
  -d fp32 \
  -X \
  --metadata '{"get_bos_id":151643, "get_eos_ids":[151643]}' \
  --output_name="phi-4-mini.pte" \
  --verbose
```
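The `--metadata` flag takes a JSON string that is baked into the `.pte` file so the runtime knows the special token IDs. A minimal sketch of how that string parses (the keys and values are taken verbatim from the export command above):

```python
import json

# The JSON string passed to --metadata in the export command above.
metadata = json.loads('{"get_bos_id":151643, "get_eos_ids":[151643]}')

# BOS is a single token id; EOS is a list, since models may stop on
# more than one token.
print(metadata["get_bos_id"])   # 151643
print(metadata["get_eos_ids"])  # [151643]
```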

Run using the executor runner:

```
# Currently a work in progress; we just need to enable the HuggingFace
# JSON tokenizer in C++. In the meantime, you can run with an example
# Python runner via pybindings:

python -m examples.models.llama.runner.native \
  --model phi_4_mini \
  --pte <path-to-pte> \
  -kv \
  --tokenizer <path-to-tokenizer>/tokenizer.json \
  --tokenizer_config <path-to-tokenizer>/tokenizer_config.json \
  --prompt "What is in a california roll?" \
  --params examples/models/phi-4-mini/config.json \
  --max_len 64 \
  --temperature 0
```
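`--temperature 0` selects greedy decoding: the runner always picks the highest-scoring token instead of sampling. A minimal, self-contained sketch of the usual temperature-sampling rule (not the runner's actual code):

```python
import math
import random

def sample_next_token(logits, temperature):
    """Pick a token index from raw logits; temperature 0 means greedy argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax over temperature-scaled logits (max-subtracted for stability),
    # then draw one index from the resulting distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    return random.choices(range(len(logits)), weights=weights)[0]

print(sample_next_token([0.1, 2.5, 0.3], temperature=0))  # 1, the argmax
```

Higher temperatures flatten the distribution and make low-scoring tokens more likely, which is why 0 gives deterministic output for a fixed prompt.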

examples/models/phi-4-mini/convert_weights.py

Lines changed: 1 addition & 1 deletion

@@ -71,7 +71,7 @@ def main():
             "model-00002-of-00002.safetensors",
         ],
         output_dir=".",
-        model_type="PHI3_MINI",
+        model_type="PHI4",
     )

     print("Loading checkpoint...")
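The `model_type` change above selects which weight-conversion mapping the script applies. In spirit, such converters rename HuggingFace parameter keys into Meta's checkpoint layout. A hypothetical sketch, with illustrative mapping entries; the real per-model mappings live in the ExecuTorch converter code:

```python
# Hypothetical HF -> Meta key mapping; the entries here are illustrative
# examples of the renaming style, not the actual Phi-4 mapping.
HF_TO_META = {
    "model.embed_tokens.weight": "tok_embeddings.weight",
    "lm_head.weight": "output.weight",
}

def remap_keys(state_dict):
    """Rename known keys to the Meta layout, passing unknown keys through."""
    return {HF_TO_META.get(key, key): value for key, value in state_dict.items()}

print(remap_keys({"lm_head.weight": "w"}))  # {'output.weight': 'w'}
```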
