Commit 997bd3d

Merge branch 'main' into dev1/danny/support_qnn_ir_backend

2 parents 8e338dc + 647e1f1

144 files changed: +1693 −901 lines


.github/workflows/android-release-artifacts.yml

Lines changed: 10 additions & 0 deletions

@@ -11,6 +11,11 @@ on:
         description: Upload the AAR to maven staging repository
         required: false
         type: boolean
+      flavor:
+        type: choice
+        options:
+          - "xnnpack"
+          - "vulkan+xnnpack"
   schedule:
     - cron: 0 10 * * *

@@ -86,6 +91,11 @@ jobs:
             sed -i "s/\(coordinates(\"org.pytorch\", \"executorch-android\", \"\)\([0-9]\+.[0-9]\+.[0-9]\+\)\(\")\)/\1$VERSION\3/" extension/android/executorch_android/build.gradle
           fi

+          FLAVOR="${{ inputs.flavor }}"
+          if [[ "$FLAVOR" == "vulkan+xnnpack" ]]; then
+            export EXECUTORCH_BUILD_VULKAN=ON
+          fi
+
           # Build AAR Package
           mkdir aar-out
           export BUILD_AAR_DIR=aar-out
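The new workflow input maps a build flavor onto extra environment variables for the AAR build. As an illustration only, the shell logic above can be restated as a small Python function (the function name is a hypothetical stand-in, not part of the workflow):

```python
def flavor_build_env(flavor: str) -> dict:
    """Map an AAR build flavor to extra build environment variables.

    Mirrors the shell snippet in the workflow diff above: only the
    "vulkan+xnnpack" flavor turns the Vulkan backend on; plain
    "xnnpack" adds nothing.
    """
    env = {}
    if flavor == "vulkan+xnnpack":
        env["EXECUTORCH_BUILD_VULKAN"] = "ON"
    return env
```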

.github/workflows/doc-build.yml

Lines changed: 14 additions & 0 deletions

@@ -14,6 +14,20 @@ on:
     - cron: '0 0 * * *'

 jobs:
+  check-urls:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - name: Check URLs
+        run: bash ./scripts/check_urls.sh
+
+  check-xrefs:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - name: Check Links
+        run: bash ./scripts/check_xrefs.sh
+
   build:
     uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main
     permissions:
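The contents of `scripts/check_urls.sh` and `scripts/check_xrefs.sh` are not part of this diff. Purely as an illustration of what such a link check involves, the core step is extracting candidate URLs from documentation text; the regex and function below are hypothetical, not taken from those scripts:

```python
import re

# Hypothetical sketch of the extraction step a URL checker performs;
# the actual scripts/check_urls.sh is not shown in this diff.
URL_PATTERN = re.compile(r'https?://[^\s<>"\')\]]+')


def extract_urls(text: str) -> list:
    """Return all http(s) URLs found in a block of text, in order."""
    return URL_PATTERN.findall(text)
```

A real checker would then probe each extracted URL (e.g. with an HTTP HEAD request) and fail the job on broken links.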

CONTRIBUTING.md

Lines changed: 4 additions & 4 deletions

@@ -45,11 +45,11 @@ executorch
 │ └── <a href="devtools/visualization">visualization</a> - Visualization tools for representing model structure and performance metrics.
 ├── <a href="docs">docs</a> - Static docs tooling and documentation source files.
 ├── <a href="examples">examples</a> - Examples of various user flows, such as model export, delegates, and runtime execution.
-├── <a href="exir">exir</a> - Ahead-of-time library: model capture and lowering APIs. EXport Intermediate Representation (EXIR) is a format for representing the result of <a href="https://pytorch.org/docs/stable/export.html">torch.export</a>. This directory contains utilities and passes for lowering the EXIR graphs into different <a href="/docs/source/ir-exir.md">dialects</a> and eventually suitable to run on target hardware.
+├── <a href="exir">exir</a> - Ahead-of-time library: model capture and lowering APIs. EXport Intermediate Representation (EXIR) is a format for representing the result of <a href="https://pytorch.org/docs/stable/export.html">torch.export</a>. This directory contains utilities and passes for lowering the EXIR graphs into different <a href="docs/source/ir-exir.md">dialects</a> and eventually suitable to run on target hardware.
 │ ├── <a href="exir/_serialize">_serialize</a> - Serialize final export artifact.
 │ ├── <a href="exir/backend">backend</a> - Backend delegate ahead of time APIs.
 │ ├── <a href="exir/capture">capture</a> - Program capture.
-│ ├── <a href="exir/dialects">dialects</a> - Op sets for various dialects in the export process. Please refer to the <a href="/docs/source/ir-exir.md">EXIR spec</a> and the <a href="/docs/source/compiler-backend-dialect.md">backend dialect</a> doc for more details.
+│ ├── <a href="exir/dialects">dialects</a> - Op sets for various dialects in the export process. Please refer to the <a href="docs/source/ir-exir.md">EXIR spec</a> and the <a href="docs/source/compiler-backend-dialect.md">backend dialect</a> doc for more details.
 │ ├── <a href="exir/emit">emit</a> - Conversion from ExportedProgram to ExecuTorch execution instructions.
 │ ├── <a href="exir/operator">operator</a> - Operator node manipulation utilities.
 │ ├── <a href="exir/passes">passes</a> - Built-in compiler passes.

@@ -68,7 +68,7 @@ executorch
 │ ├── <a href="extension/memory_allocator">memory_allocator</a> - 1st party memory allocator implementations.
 │ ├── <a href="extension/module">module</a> - A simplified C++ wrapper for the runtime. An abstraction that deserializes and executes an ExecuTorch artifact (.pte file). Refer to the <a href="docs/source/extension-module.md">module documentation</a> for more information.
 │ ├── <a href="extension/parallel">parallel</a> - C++ threadpool integration.
-│ ├── <a href="extension/pybindings">pybindings</a> - Python API for executorch runtime. This is powering up the <a href="docs/source/runtime-python-api-reference.md">runtime Python API</a> for ExecuTorch.
+│ ├── <a href="extension/pybindings">pybindings</a> - Python API for executorch runtime. This is powering up the <a href="docs/source/runtime-python-api-reference.rst">runtime Python API</a> for ExecuTorch.
 │ ├── <a href="extension/pytree">pytree</a> - C++ and Python flattening and unflattening lib for pytrees.
 │ ├── <a href="extension/runner_util">runner_util</a> - Helpers for writing C++ PTE-execution tools.
 │ ├── <a href="extension/tensor">tensor</a> - Tensor maker and <code>TensorPtr</code>, details in <a href="docs/source/extension-tensor.md">this documentation</a>. For how to use <code>TensorPtr</code> and <code>Module</code>, please refer to the <a href="docs/source/using-executorch-cpp.md">"Using ExecuTorch with C++"</a> doc.

@@ -114,7 +114,7 @@ If you're completely new to open-source projects, GitHub, or ExecuTorch, please
 1. If you've changed APIs or added a new tool or feature, [update the
    documentation](#updating-documentation).
 1. If you added an experimental API or deprecated an existing API, follow the
-   [API Life Cycle and Deprecation Policy](/docs/source/api-life-cycle.md).
+   [API Life Cycle and Deprecation Policy](docs/source/api-life-cycle.md).
 1. Make sure your code follows the [style guides](#coding-style) and passes the
    [lint checks](#lintrunner).
 1. If you haven't already, complete the [Contributor License Agreement ("CLA")](#contributor-license-agreement-cla).

README-wheel.md

Lines changed: 1 addition & 1 deletion

@@ -25,6 +25,6 @@ tutorials and documentation. Here are some starting points:
 * [Exporting to ExecuTorch](https://pytorch.org/executorch/main/tutorials/export-to-executorch-tutorial)
   * Learn the fundamentals of exporting a PyTorch `nn.Module` to ExecuTorch, and
     optimizing its performance using quantization and hardware delegation.
-* Running LLaMA on [iOS](docs/source/llm/llama-demo-ios) and [Android](docs/source/llm/llama-demo-android) devices.
+* Running LLaMA on [iOS](docs/source/llm/llama-demo-ios.md) and [Android](docs/source/llm/llama-demo-android.md) devices.
   * Build and run LLaMA in a demo mobile app, and learn how to integrate models
     with your own apps.

backends/apple/coreml/runtime/test/setup.md

Lines changed: 8 additions & 8 deletions

@@ -4,18 +4,18 @@ This is a tutorial for setting up tests for the **Core ML** backend.

 ## Running tests

-1. Follow the instructions described in [Setting Up ExecuTorch](/docs/source/getting-started-setup.md) to set up ExecuTorch environment.
+1. Follow the instructions described in [Setting Up ExecuTorch](../../../../../docs/source/getting-started-setup.rst) to set up ExecuTorch environment.

 2. Run `install_requirements.sh` to install dependencies required by the **Core ML** backend.

 ```bash
 cd executorch

-sh backends/apple/coreml/scripts/install_requirements.sh
+sh backends/apple/coreml/scripts/install_requirements.sh

-```
+```

-3. Follow the instructions described in [Building with CMake](/docs/source/runtime-build-and-cross-compilation.md#building-with-cmake) to set up CMake build system.
+3. Follow the instructions described in [Building with CMake](../../../../../docs/source/using-executorch-cpp.md#building-with-cmake) to set up CMake build system.

 4. Install [Xcode](https://developer.apple.com/xcode/).

@@ -26,7 +26,7 @@ sh backends/apple/coreml/scripts/install_requirements.sh
 ```bash
 cd executorch

-# Builds macOS universal test bundle.
+# Builds macOS universal test bundle.

 sh backends/apple/coreml/srcipts/build_tests.sh

@@ -40,15 +40,15 @@ cd executorch
 sh backends/apple/coreml/srcipts/run_tests.sh

 ```
-
+
 ## Updating tests

 1. Open the Xcode workspace.

 ```bash
 cd executorch

-# Builds macOS universal test bundle.
+# Builds macOS universal test bundle.

 open backends/apple/coreml/runtime/workspace/executorchcoreml.xcworkspace

@@ -62,4 +62,4 @@ cd executorch
 # There is no need to build the tests.
 sh backends/apple/coreml/srcipts/run_tests.sh

-```
+```
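The link fixes in this file (and in the other docs touched by this commit) replace repo-absolute `/docs/source/...` paths with paths relative to the file's own directory. The number of `../` hops follows mechanically from the file's depth, which `os.path.relpath` reproduces:

```python
import os.path

# setup.md lives at backends/apple/coreml/runtime/test/setup.md, five
# directories deep, so a relative link to docs/source needs five "../" hops.
link = os.path.relpath("docs/source", start="backends/apple/coreml/runtime/test")
print(link)  # ../../../../../docs/source
```

This is a pure path computation (no filesystem access), so it can be used to sanity-check relative links before committing doc changes.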

backends/apple/coreml/setup.md

Lines changed: 2 additions & 2 deletions

@@ -4,7 +4,7 @@ This is a tutorial for setting up the Core ML backend.

 ## AOT Setup

-1. Follow the instructions described in [Setting Up ExecuTorch](/docs/source/getting-started-setup.md) to set up ExecuTorch environment.
+1. Follow the instructions described in [Setting Up ExecuTorch](../../../docs/source/getting-started-setup.rst) to set up ExecuTorch environment.


 2. Run the example script to validate that the **Core ML** backend is set up correctly.

@@ -28,7 +28,7 @@ delegated_program_manager = edge_program_manager.to_backend(CoreMLPartitioner())

 ## Integrating Core ML delegate into runtime.

-1. Follow the instructions described in [Building with CMake](/docs/source/runtime-build-and-cross-compilation.md#building-with-cmake) to set up CMake build system.
+1. Follow the instructions described in [Building with CMake](../../../docs/source/using-executorch-cpp.md#building-with-cmake) to set up CMake build system.

 2. Install [Xcode](https://developer.apple.com/xcode/).

backends/apple/mps/mps_preprocess.py

Lines changed: 15 additions & 1 deletion

@@ -6,6 +6,7 @@
 from typing import ClassVar, Dict, final, List, Tuple

 import torch
+from executorch import exir

 from executorch.backends.apple.mps.operators.node_visitor import (
     get_node_visitors,

@@ -35,6 +36,7 @@
 from executorch.exir.passes.memory_format_ops_pass import DimOrderOpsRevertPass
 from executorch.exir.program._program import _transform
+from executorch.exir.verification.verifier import EXIREdgeDialectVerifier
 from torch.export.exported_program import ExportedProgram

 FORMAT = "[%(levelname)s %(asctime)s %(filename)s:%(lineno)s] %(message)s"

@@ -87,7 +89,19 @@ def preprocess(
     # the `output_ids` array in the schema.

     # TODO: Remove this once we have a better support for the dim-order ops.
-    edge_program = _transform(edge_program, DimOrderOpsRevertPass())
+    # Need to override the verifier to skip the non dim-order ops from tripping the default verifier.
+    edge_program = _transform(
+        edge_program,
+        DimOrderOpsRevertPass(),
+        override_verifiers=[
+            EXIREdgeDialectVerifier(
+                edge_compile_config=exir.EdgeCompileConfig(
+                    _check_ir_validity=False,  # Disable the edge dialect verifier, since we are in the mps backend.
+                ),
+                class_only=True,
+            )
+        ],
+    )

     mps_graph = MPSGraph(
         version="0",

backends/apple/mps/setup.md

Lines changed: 7 additions & 7 deletions

@@ -12,11 +12,11 @@ The MPS backend device maps machine learning computational graphs and primitives
 :::
 :::{grid-item-card} Tutorials we recommend you complete before this:
 :class-card: card-prerequisites
-* [Introduction to ExecuTorch](intro-how-it-works.md)
-* [Setting up ExecuTorch](getting-started-setup.md)
-* [Building ExecuTorch with CMake](runtime-build-and-cross-compilation.md)
-* [ExecuTorch iOS Demo App](demo-apps-ios.md)
-* [ExecuTorch iOS LLaMA Demo App](llm/llama-demo-ios.md)
+* [Introduction to ExecuTorch](../../../docs/source/intro-how-it-works.md)
+* [Setting up ExecuTorch](../../../docs/source/getting-started-setup.rst)
+* [Building ExecuTorch with CMake](../../../docs/source/using-executorch-cpp.md#building-with-cmake)
+* [ExecuTorch iOS Demo App](../../../docs/source/demo-apps-ios.md)
+* [ExecuTorch iOS LLaMA Demo App](../../../docs/source/llm/llama-demo-ios.md)
 :::
 ::::

@@ -111,12 +111,12 @@ python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp
 ```

 ### Profiling:
-1. [Optional] Generate an [ETRecord](./etrecord.rst) while you're exporting your model.
+1. [Optional] Generate an [ETRecord](../../../docs/source/etrecord.rst) while you're exporting your model.
 ```bash
 cd executorch
 python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --generate_etrecord -b
 ```
-2. Run your Program on the ExecuTorch runtime and generate an [ETDump](./etdump.md).
+2. Run your Program on the ExecuTorch runtime and generate an [ETDump](../../../docs/source/etdump.md).
 ```
 ./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_bundled_fp16.pte --bundled_program --dump-outputs
 ```

backends/cadence/aot/pass_utils.py

Lines changed: 4 additions & 3 deletions

@@ -35,8 +35,8 @@ class CadencePassAttribute:
 ALL_CADENCE_PASSES: dict[ExportPass, CadencePassAttribute] = {}


-def get_cadence_pass_attribute(p: ExportPass) -> CadencePassAttribute:
-    return ALL_CADENCE_PASSES[p]
+def get_cadence_pass_attribute(p: ExportPass) -> Optional[CadencePassAttribute]:
+    return ALL_CADENCE_PASSES.get(p, None)


 # A decorator that registers a pass.

@@ -61,7 +61,8 @@ def create_cadence_pass_filter(
     def _filter(p: ExportPass) -> bool:
         pass_attribute = get_cadence_pass_attribute(p)
         return (
-            pass_attribute.opt_level is not None
+            pass_attribute is not None
+            and pass_attribute.opt_level is not None
             and pass_attribute.opt_level <= opt_level
             and (not pass_attribute.debug_pass or debug)
         )
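The change above swaps a `KeyError`-prone `dict[key]` lookup for `dict.get`, and makes the filter treat unregistered passes as filtered-out rather than crashing. A minimal standalone sketch of that pattern, with a simplified registry and attribute names that are stand-ins for the real `CadencePassAttribute` fields:

```python
from typing import Optional

# Simplified stand-in for ALL_CADENCE_PASSES; values mimic the
# opt_level / debug_pass attributes used by the real filter.
REGISTRY = {"pass_a": {"opt_level": 1, "debug_pass": False}}


def get_attribute(name: str) -> Optional[dict]:
    # dict.get returns None for unregistered passes instead of raising KeyError
    return REGISTRY.get(name, None)


def keep(name: str, opt_level: int, debug: bool = False) -> bool:
    attr = get_attribute(name)
    return (
        attr is not None  # unregistered passes are simply filtered out
        and attr["opt_level"] is not None
        and attr["opt_level"] <= opt_level
        and (not attr["debug_pass"] or debug)
    )
```

Because `and` short-circuits, the `attr is not None` guard makes the later attribute accesses safe even for unknown passes.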

backends/cadence/aot/replace_ops.py

Lines changed: 80 additions & 27 deletions

@@ -17,8 +17,9 @@
 # pyre-unsafe

 import math
+import operator
 from operator import neg
-from typing import cast, Dict, Iterable, Sequence, Set, Tuple
+from typing import cast, Dict, Iterable, Optional, Sequence, Set, Tuple

 import torch
 import torch.fx

@@ -1806,30 +1807,6 @@ def call_operator(self, op, args, kwargs, meta):
     return super().call_operator(op, tuple(new_args), kwargs, meta)


-@register_cadence_pass(CadencePassAttribute(opt_level=0))
-class ReplaceAtenLinalgVectorNormWithCadenceLinalgVectorNormPass(ExportPass):
-    """
-    Replace the aten.linalg_vector_norm op with a custom op.
-    aten.linalg_vector_norm is not supported by Jarvis, so we
-    need to replace it with native_batch_norm at all optimization levels.
-    """
-
-    def call_operator(self, op, args, kwargs, meta):
-        if op != exir_ops.edge.aten.linalg_vector_norm.default:
-            return super().call_operator(op, args, kwargs, meta)
-
-        assert (
-            len(args) == 1
-        ), "aten.linalg_vector_norm should have 1 argument (a tensor), we do not support any custom variants"
-
-        return super().call_operator(
-            exir_ops.edge.cadence.linalg_vector_norm.default,
-            args,
-            kwargs,
-            meta,
-        )
-
-
 @register_cadence_pass(CadencePassAttribute(opt_level=1))
 class ReplaceSingleElementTensorArgumentsFromFullOpWithScalarPass(ExportPass):
     """

@@ -2206,6 +2183,82 @@ def call_operator(
     )


+# Adapted from fbcode/pyspeech/opt_passes/replace_ops.py
+@register_cadence_pass(CadencePassAttribute(opt_level=2))
+class ReplaceSplitWithSlicePass(ExportPass):
+    """
+    split_with_sizes() delegates to slice() op, so perform this replacement here.
+    This avoids the expense of delegation from ATen.
+    """
+
+    # For split_with_sizes, return the slice dim and extent for each split.
+    def get_split_sizes(
+        self, graph_module: torch.fx.GraphModule, node: torch.fx.Node
+    ) -> Optional[list[tuple[int, ...]]]:
+        # Parse the args of the split_with_sizes op
+        tensor_arg, split_sizes = node.args[0:2]
+        assert isinstance(tensor_arg, torch.fx.Node)
+        in_shape = get_shape(graph_module, tensor_arg)
+        split_dim = 0 if len(node.args) < 3 else node.args[2]
+        if in_shape is None:
+            return None
+
+        # Canonicalize the split dimension
+        assert isinstance(split_dim, int)
+        split_dim = split_dim if split_dim >= 0 else len(in_shape) + split_dim
+
+        # Create the slice op args corresponding to each split
+        slice_ops = []
+        split_start = 0
+        assert isinstance(split_sizes, list)
+        for split_size in split_sizes:
+            split_end = split_start + split_size
+            slice_args = (split_dim, split_start, split_end)
+            slice_ops.append(slice_args)
+            split_start = split_end
+
+        return slice_ops
+
+    def call(self, graph_module: torch.fx.GraphModule) -> PassResult:
+        graph = graph_module.graph
+        for node in graph.nodes:
+            if not isinstance(node.target, EdgeOpOverload):
+                continue
+            if (
+                get_edge_overload_packet(node.target)
+                != exir_ops.edge.aten.split_with_sizes_copy
+            ):
+                continue
+            # All the users of this split_with_sizes op must be getitem ops
+            if any(user.target != operator.getitem for user in node.users):
+                continue
+
+            # Get the slice dim and extent for each split
+            slice_ops = self.get_split_sizes(graph_module, node)
+            if slice_ops is None:
+                continue
+
+            # Go over each getitem user, and replace it with slice op
+            for user in list(node.users.keys()):
+                assert user.target == operator.getitem
+                item_idx = user.args[1]
+                assert item_idx < len(slice_ops)
+                cur_slice = slice_ops[item_idx]
+                with graph.inserting_before(user):
+                    cur_slice_node = graph.call_function(
+                        exir_ops.edge.aten.slice_copy.Tensor,
+                        (node.args[0], cur_slice[0], cur_slice[1], cur_slice[2], 1),
+                    )
+                user.replace_all_uses_with(cur_slice_node)
+                graph.erase_node(user)
+
+            graph.erase_node(node)
+
+        graph_module.recompile()
+        result = super().call(graph_module)
+        return result
+
+
 # This class encapsulates all the functions that replace/switch one op in the
 # graph with another.
 class CadenceReplaceOpsInGraph:

@@ -2243,7 +2296,7 @@ class CadenceReplaceOpsInGraph:
         ReplacePT2DequantWithCadenceDequantPass,
         ReplaceSingleElementTensorArgumentsFromFullOpWithScalarPass,
         ReplaceAtenAvgPoolWithJarvisAvgPoolPass,
-        ReplaceAtenLinalgVectorNormWithCadenceLinalgVectorNormPass,
         ReplaceWhereWithFullArgsWithWhereScalar,
-        # ReplaceGeluWithApproximateGeluPass,
+        ReplaceGeluWithApproximateGeluPass,
+        ReplaceSplitWithSlicePass,
     ]
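The new `ReplaceSplitWithSlicePass` rewrites each `getitem` consumer of a `split_with_sizes_copy` node into a `slice_copy` over a `(dim, start, end)` range. The index arithmetic it performs can be checked in isolation; the helper below is a hypothetical standalone version of the bookkeeping in `get_split_sizes` (the real pass reads these values off `torch.fx` nodes rather than plain arguments):

```python
def split_to_slice_args(split_sizes, split_dim, rank):
    """Compute the (dim, start, end) slice triple for each split.

    Mirrors the accumulation in ReplaceSplitWithSlicePass.get_split_sizes:
    consecutive splits tile the split dimension with half-open [start, end)
    ranges, and a negative split_dim is canonicalized against the rank.
    """
    dim = split_dim if split_dim >= 0 else rank + split_dim
    slices, start = [], 0
    for size in split_sizes:
        slices.append((dim, start, start + size))
        start += size
    return slices
```

For example, splitting a rank-2 tensor along `dim=-1` into sizes `[2, 3, 5]` yields the slices `(1, 0, 2)`, `(1, 2, 5)`, and `(1, 5, 10)`, which is exactly what the pass feeds to `exir_ops.edge.aten.slice_copy.Tensor` (with a step of 1).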
