Summary:
Pull Request resolved: #5972
## Context
Implement various fixes to the Vulkan delegate found while performing QA for the Vulkan docs. I elected to package everything into this one diff so that it is easy to cherry-pick into the 0.4 release.
imported-using-ghimport
Test Plan: Imported from OSS
Reviewed By: jorgep31415
Differential Revision: D64022818
Pulled By: SS-JIA
fbshipit-source-id: 35782970e9db1ab33154ccbae2e10c77d911c041
backends/vulkan/docs/android_demo.md (+40 -63)
@@ -7,13 +7,13 @@ is a native GPU delegate for ExecuTorch.
 ::::{grid} 2
 :::{grid-item-card} What you will learn in this tutorial:
 :class-card: card-content
-* How to export the Stories 110M parameter model with partial GPU delegation
+* How to export the Llama3.2-1B parameter model with partial GPU delegation
 * How to execute the partially delegated model on Android
 :::
 :::{grid-item-card} Prerequisites:
 :class-card: card-prerequisites
 * Follow [**Setting up ExecuTorch**](./getting-started-setup.md)
-* Follow [**Setting up the ExecuTorch LLaMA Android Demo App**](./llm/llama-demo-android.md)
+* It is also recommended that you read through [**ExecuTorch Vulkan Delegate**](./native-delegates-executorch-vulkan-delegate.md) and follow the example in that page
 :::
 ::::
 
@@ -23,65 +23,55 @@ Note that all the steps below should be performed from the ExecuTorch repository
 root directory, and assumes that you have gone through the steps of setting up
 ExecuTorch.
 
-You should also refer to the **Prerequisites** section of the [**Setting up the ExecuTorch LLaMA Android Demo App**](./llm/llama-demo-android.md)
-Tutorial in order to install the specified versions of the Android NDK and the
-Android SDK.
+It is also assumed that the Android NDK and Android SDK is installed, and the
+following environment examples are set.
 
 ```shell
-# Recommended version is Android NDK 26.3.11579264.
 export ANDROID_NDK=<path_to_ndk>
-# Select an appropriate Android ABI
+# Select an appropriate Android ABI for your device
 export ANDROID_ABI=arm64-v8a
 # All subsequent commands should be performed from ExecuTorch repo root
 cd <path_to_executorch_root>
 # Make sure adb works
 adb --version
 ```
 
-## Lowering the Stories 110M model to Vulkan
+## Lowering the Llama3.2-1B model to Vulkan
 
 ::::{note}
 The resultant model will only be partially delegated to the Vulkan backend. In
 particular, only binary arithmetic operators (`aten.add`, `aten.sub`,
-`aten.mul`, `aten.div`) and the matrix multiplication operator (`aten.mm`) will
-be executed on the GPU via the Vulkan delegate. The rest of the model will be
-executed using Portable operators. This is because the Vulkan delegate is still
-early in development and currently has limited operator coverage.
-::::
-
-First, download `stories110M.pt` and `tokenizer.model` from Github: