Commit fc5cdde

Update on "[ET-VK] Minor performance improvements to native layer norm."
This diff introduces minor performance improvements to the native layer norm function in the Vulkan backend of ExecuTorch. In the new approach, the mean and variance are calculated in two separate passes. The shader is dispatched based on the input texture size, and each input texel is read and stored in shared memory. The input stored in shared memory is then summed using a reduce function. This implementation makes better use of the GPU's parallel processing capabilities.
Differential Revision: [D72430290](https://our.internmc.facebook.com/intern/diff/D72430290/)
[ghstack-poisoned]
2 parents 0f82910 + e5d5c59 commit fc5cdde
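
The two-pass scheme in the commit message can be sketched on the CPU as follows. This is only an illustration under stated assumptions, not the shader from this commit: the names `tree_reduce_sum` and `layer_norm_row` are hypothetical, a Python list stands in for shared memory, the optional weight and bias are omitted, and the `eps` default is an assumed value.

```python
import math


def tree_reduce_sum(values):
    """Sum values pairwise, the way a workgroup reduces a shared-memory array."""
    shared = list(values)          # stands in for the shared-memory buffer
    size = 1
    while size < len(shared):      # pad to a power of two so strides halve cleanly
        size *= 2
    shared += [0.0] * (size - len(shared))

    stride = size // 2
    while stride > 0:
        # In a shader, each i would be one invocation, with a barrier after the loop.
        for i in range(stride):
            shared[i] += shared[i + stride]
        stride //= 2
    return shared[0]


def layer_norm_row(row, eps=1e-5):
    """Normalize one non-empty row using two separate reduction passes."""
    n = len(row)
    # Pass 1: reduce the inputs to get the mean.
    mean = tree_reduce_sum(row) / n
    # Pass 2: reduce the squared deviations to get the (biased) variance.
    var = tree_reduce_sum([(x - mean) ** 2 for x in row]) / n
    rstd = 1.0 / math.sqrt(var + eps)
    return [(x - mean) * rstd for x in row], mean, rstd
```

Each halving of `stride` corresponds to one parallel step, so a row of N elements needs about log2(N) steps instead of N sequential additions. Computing the variance in its own pass, from deviations around the already known mean, costs a second reduction but avoids subtracting two large, nearly equal sums.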

150 files changed, +4603 -1016 lines changed

.github/workflows/_android.yml

Lines changed: 2 additions & 1 deletion
@@ -131,7 +131,8 @@ jobs:
 # https://github.com/ReactiveCircus/android-emulator-runner. The max number
 # of cores we can set is 6, any higher number will be reduced to 6.
 cores: 6
-ram-size: 12288M
+ram-size: 16384M
+heap-size: 12288M
 force-avd-creation: false
 disable-animations: true
 emulator-options: -no-snapshot-save -no-window -gpu swiftshader_indirect -noaudio -no-boot-anim -camera-back none

CONTRIBUTING.md

Lines changed: 13 additions & 13 deletions
@@ -5,10 +5,10 @@ it easy to contribute to this project.
 ## Dev Install
 
 Set up your environment by following the instructions at
-https://pytorch.org/executorch/stable/getting-started-setup.html to clone
+https://pytorch.org/executorch/main/getting-started-setup to clone
 the repo and install the necessary requirements.
 
-Refer to this [document](https://pytorch.org/executorch/main/using-executorch-building-from-source.html) to build ExecuTorch from source.
+Refer to this [document](docs/source/using-executorch-building-from-source.md) to build ExecuTorch from source.
 
 ### Dev Setup for Android
 For Android, please refer to the [Android documentation](docs/source/using-executorch-android.md).
@@ -40,8 +40,8 @@ executorch
 ├── <a href="devtools">devtools</a> - Model profiling, debugging, and inspection. Please refer to the <a href="docs/source/devtools-overview.md">tools documentation</a> for more information.
 │ ├── <a href="devtools/bundled_program">bundled_program</a> - a tool for validating ExecuTorch model. See <a href="docs/source/bundled-io.md">doc</a>.
 │ ├── <a href="devtools/etdump">etdump</a> - ETDump - a format for saving profiling and debugging data from runtime. See <a href="docs/source/etdump.md">doc</a>.
-│ ├── <a href="devtools/etrecord">etrecord</a> - ETRecord - AOT debug artifact for ExecuTorch. See <a href="https://pytorch.org/executorch/main/etrecord.html">doc</a>.
-│ ├── <a href="devtools/inspector">inspector</a> - Python API to inspect ETDump and ETRecord. See <a href="https://pytorch.org/executorch/main/model-inspector.html">doc</a>.
+│ ├── <a href="devtools/etrecord">etrecord</a> - ETRecord - AOT debug artifact for ExecuTorch. See <a href="https://pytorch.org/executorch/main/etrecord">doc</a>.
+│ ├── <a href="devtools/inspector">inspector</a> - Python API to inspect ETDump and ETRecord. See <a href="https://pytorch.org/executorch/main/model-inspector">doc</a>.
 │ └── <a href="devtools/visualization">visualization</a> - Visualization tools for representing model structure and performance metrics.
 ├── <a href="docs">docs</a> - Static docs tooling and documentation source files.
 ├── <a href="examples">examples</a> - Examples of various user flows, such as model export, delegates, and runtime execution.
@@ -57,8 +57,8 @@ executorch
 │ ├── <a href="exir/serde">serde</a> - Graph module serialization/deserialization.
 │ ├── <a href="exir/verification">verification</a> - IR verification.
 ├── <a href="extension">extension</a> - Extensions built on top of the runtime.
-│ ├── <a href="extension/android">android</a> - ExecuTorch wrappers for Android apps. Please refer to the <a href="docs/source/using-executorch-android.md">Android documentation</a> and <a href="https://pytorch.org/executorch/main/javadoc/">Javadoc</a> for more information.
-│ ├── <a href="extension/apple">apple</a> - ExecuTorch wrappers for iOS apps. Please refer to the <a href="docs/source/using-executorch-ios.md">iOS documentation</a> and <a href="https://pytorch.org/executorch/stable/apple-runtime.html">how to integrate into Apple platform</a> for more information.
+│ ├── <a href="extension/android">android</a> - ExecuTorch wrappers for Android apps. Please refer to the <a href="docs/source/using-executorch-android.md">Android documentation</a> and <a href="https://pytorch.org/executorch/main/javadoc">Javadoc</a> for more information.
+│ ├── <a href="extension/apple">apple</a> - ExecuTorch wrappers for iOS apps. Please refer to the <a href="docs/source/using-executorch-ios.md">iOS documentation</a> on how to integrate into Apple platform</a> for more information.
 │ ├── <a href="extension/aten_util">aten_util</a> - Converts to and from PyTorch ATen types.
 │ ├── <a href="extension/data_loader">data_loader</a> - 1st party data loader implementations.
 │ ├── <a href="extension/evalue_util">evalue_util</a> - Helpers for working with EValue objects.
@@ -68,10 +68,10 @@ executorch
 │ ├── <a href="extension/memory_allocator">memory_allocator</a> - 1st party memory allocator implementations.
 │ ├── <a href="extension/module">module</a> - A simplified C++ wrapper for the runtime. An abstraction that deserializes and executes an ExecuTorch artifact (.pte file). Refer to the <a href="docs/source/extension-module.md">module documentation</a> for more information.
 │ ├── <a href="extension/parallel">parallel</a> - C++ threadpool integration.
-│ ├── <a href="extension/pybindings">pybindings</a> - Python API for executorch runtime. This is powering up the <a href="https://pytorch.org/executorch/main/runtime-python-api-reference.html">runtime Python API</a> for ExecuTorch.
+│ ├── <a href="extension/pybindings">pybindings</a> - Python API for executorch runtime. This is powering up the <a href="docs/source/runtime-python-api-reference.md">runtime Python API</a> for ExecuTorch.
 │ ├── <a href="extension/pytree">pytree</a> - C++ and Python flattening and unflattening lib for pytrees.
 │ ├── <a href="extension/runner_util">runner_util</a> - Helpers for writing C++ PTE-execution tools.
-│ ├── <a href="extension/tensor">tensor</a> - Tensor maker and <code>TensorPtr</code>, details in <a href="/docs/source/extension-tensor.md">this documentation</a>. For how to use <code>TensorPtr</code> and <code>Module</code>, please refer to the <a href="/docs/source/using-executorch-cpp.md">"Using ExecuTorch with C++"</a> doc.
+│ ├── <a href="extension/tensor">tensor</a> - Tensor maker and <code>TensorPtr</code>, details in <a href="docs/source/extension-tensor.md">this documentation</a>. For how to use <code>TensorPtr</code> and <code>Module</code>, please refer to the <a href="docs/source/using-executorch-cpp.md">"Using ExecuTorch with C++"</a> doc.
 │ ├── <a href="extension/testing_util">testing_util</a> - Helpers for writing C++ tests.
 │ ├── <a href="extension/threadpool">threadpool</a> - Threadpool.
 │ └── <a href="extension/training">training</a> - Experimental libraries for on-device training.
@@ -85,7 +85,7 @@ executorch
 ├── <a href="runtime">runtime</a> - Core C++ runtime. These components are used to execute the ExecuTorch program. Please refer to the <a href="docs/source/runtime-overview.md">runtime documentation</a> for more information.
 │ ├── <a href="runtime/backend">backend</a> - Backend delegate runtime APIs.
 │ ├── <a href="runtime/core">core</a> - Core structures used across all levels of the runtime. Basic components such as <code>Tensor</code>, <code>EValue</code>, <code>Error</code> and <code>Result</code> etc.
-│ ├── <a href="runtime/executor">executor</a> - Model loading, initialization, and execution. Runtime components that execute the ExecuTorch program, such as <code>Program</code>, <code>Method</code>. Refer to the <a href="https://pytorch.org/executorch/main/executorch-runtime-api-reference.html">runtime API documentation</a> for more information.
+│ ├── <a href="runtime/executor">executor</a> - Model loading, initialization, and execution. Runtime components that execute the ExecuTorch program, such as <code>Program</code>, <code>Method</code>. Refer to the <a href="https://pytorch.org/executorch/main/executorch-runtime-api-reference">runtime API documentation</a> for more information.
 │ ├── <a href="runtime/kernel">kernel</a> - Kernel registration and management.
 │ └── <a href="runtime/platform">platform</a> - Layer between architecture specific code and portable C++.
 ├── <a href="schema">schema</a> - ExecuTorch PTE file format flatbuffer schemas.
@@ -102,7 +102,7 @@ executorch
 ## Contributing workflow
 We actively welcome your pull requests (PRs).
 
-If you're completely new to open-source projects, GitHub, or ExecuTorch, please see our [New Contributor Guide](./docs/source/new-contributor-guide.md) for a step-by-step walkthrough on making your first contribution. Otherwise, read on.
+If you're completely new to open-source projects, GitHub, or ExecuTorch, please see our [New Contributor Guide](docs/source/new-contributor-guide.md) for a step-by-step walkthrough on making your first contribution. Otherwise, read on.
 
 1. [Claim an issue](#claiming-issues), if present, before starting work. If an
 issue doesn't cover the work you plan to do, consider creating one to provide
@@ -245,7 +245,7 @@ modifications to the Google C++ style guide.
 
 ### C++ Portability Guidelines
 
-See also [Portable C++ Programming](/docs/source/portable-cpp-programming.md)
+See also [Portable C++ Programming](docs/source/portable-cpp-programming.md)
 for detailed advice.
 
 #### C++ language version
@@ -417,9 +417,9 @@ for basics.
 
 ## For Backend Delegate Authors
 
-- Use [this](/docs/source/backend-delegates-integration.md) guide when
+- Use [this](docs/source/backend-delegates-integration.md) guide when
 integrating your delegate with ExecuTorch.
-- Refer to [this](/docs/source/backend-delegates-dependencies.md) set of
+- Refer to [this](docs/source/backend-delegates-dependencies.md) set of
 guidelines when including a third-party depenency for your delegate.
 
 &nbsp;

Package.swift

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 //
 // For details on building frameworks locally or using prebuilt binaries,
 // see the documentation:
-// https://pytorch.org/executorch/main/using-executorch-ios.html
+// https://pytorch.org/executorch/main/using-executorch-ios
 
 import PackageDescription
 

README-wheel.md

Lines changed: 9 additions & 20 deletions
@@ -10,32 +10,21 @@ The `executorch` pip package is in beta.
 
 The prebuilt `executorch.runtime` module included in this package provides a way
 to run ExecuTorch `.pte` files, with some restrictions:
-* Only [core ATen
-operators](https://pytorch.org/executorch/stable/ir-ops-set-definition.html)
-are linked into the prebuilt module
-* Only the [XNNPACK backend
-delegate](https://pytorch.org/executorch/main/native-delegates-executorch-xnnpack-delegate.html)
-is linked into the prebuilt module.
-* \[macOS only] [Core ML](https://pytorch.org/executorch/main/build-run-coreml.html)
-and [MPS](https://pytorch.org/executorch/main/build-run-mps.html) backend
-delegates are also linked into the prebuilt module.
+* Only [core ATen operators](docs/source/ir-ops-set-definition.md) are linked into the prebuilt module
+* Only the [XNNPACK backend delegate](docs/source/backends-xnnpack.md) is linked into the prebuilt module.
+* \[macOS only] [Core ML](docs/source/backends-coreml.md) and [MPS](docs/source/backends-mps.md) backend
+are also linked into the prebuilt module.
 
-Please visit the [ExecuTorch website](https://pytorch.org/executorch/) for
+Please visit the [ExecuTorch website](https://pytorch.org/executorch) for
 tutorials and documentation. Here are some starting points:
-* [Getting
-Started](https://pytorch.org/executorch/stable/getting-started-setup.html)
+* [Getting Started](https://pytorch.org/executorch/main/getting-started-setup)
 * Set up the ExecuTorch environment and run PyTorch models locally.
-* [Working with
-local LLMs](https://pytorch.org/executorch/stable/llm/getting-started.html)
+* [Working with local LLMs](docs/source/llm/getting-started.md)
 * Learn how to use ExecuTorch to export and accelerate a large-language model
 from scratch.
-* [Exporting to
-ExecuTorch](https://pytorch.org/executorch/main/tutorials/export-to-executorch-tutorial.html)
+* [Exporting to ExecuTorch](https://pytorch.org/executorch/main/tutorials/export-to-executorch-tutorial)
 * Learn the fundamentals of exporting a PyTorch `nn.Module` to ExecuTorch, and
 optimizing its performance using quantization and hardware delegation.
-* Running LLaMA on
-[iOS](https://pytorch.org/executorch/stable/llm/llama-demo-ios.html) and
-[Android](https://pytorch.org/executorch/stable/llm/llama-demo-android.html)
-devices.
+* Running LLaMA on [iOS](docs/source/llm/llama-demo-ios) and [Android](docs/source/llm/llama-demo-android) devices.
 * Build and run LLaMA in a demo mobile app, and learn how to integrate models
 with your own apps.

README.md

Lines changed: 5 additions & 5 deletions
@@ -1,5 +1,5 @@
 <div align="center">
-<img src="./docs/source/_static/img/et-logo.png" alt="Logo" width="200">
+<img src="docs/source/_static/img/et-logo.png" alt="Logo" width="200">
 <h1 align="center">ExecuTorch: A powerful on-device AI Framework</h1>
 </div>
 
@@ -8,7 +8,7 @@
 <a href="https://github.com/pytorch/executorch/graphs/contributors"><img src="https://img.shields.io/github/contributors/pytorch/executorch?style=for-the-badge&color=blue" alt="Contributors"></a>
 <a href="https://github.com/pytorch/executorch/stargazers"><img src="https://img.shields.io/github/stars/pytorch/executorch?style=for-the-badge&color=blue" alt="Stargazers"></a>
 <a href="https://discord.gg/Dh43CKSAdc"><img src="https://img.shields.io/badge/Discord-Join%20Us-purple?logo=discord&logoColor=white&style=for-the-badge" alt="Join our Discord community"></a>
-<a href="https://pytorch.org/executorch/stable/index.html"><img src="https://img.shields.io/badge/Documentation-000?logo=googledocs&logoColor=FFE165&style=for-the-badge" alt="Check out the documentation"></a>
+<a href="https://pytorch.org/executorch/main/index"><img src="https://img.shields.io/badge/Documentation-000?logo=googledocs&logoColor=FFE165&style=for-the-badge" alt="Check out the documentation"></a>
 <hr>
 </div>
 
@@ -49,9 +49,9 @@ Key value propositions of ExecuTorch are:
 ## Getting Started
 To get started you can:
 
-- Visit the [Step by Step Tutorial](https://pytorch.org/executorch/main/index.html) to get things running locally and deploy a model to a device
-- Use this [Colab Notebook](https://pytorch.org/executorch/stable/getting-started-setup.html#quick-setup-colab-jupyter-notebook-prototype) to start playing around right away
-- Jump straight into LLM use cases by following specific instructions for [Llama](./examples/models/llama/README.md) and [Llava](./examples/models/llava/README.md)
+- Visit the [Step by Step Tutorial](https://pytorch.org/executorch/main/index) to get things running locally and deploy a model to a device
+- Use this [Colab Notebook](https://pytorch.org/executorch/main/getting-started-setup#quick-setup-colab-jupyter-notebook-prototype) to start playing around right away
+- Jump straight into LLM use cases by following specific instructions for [Llama](examples/models/llama/README.md) and [Llava](examples/models/llava/README.md)
 
 ## Feedback and Engagement
 

backends/apple/coreml/runtime/delegate/ETCoreMLAsset.h

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
 
 #import <Foundation/Foundation.h>
 
-#import <asset.h>
+#import "asset.h"
 
 NS_ASSUME_NONNULL_BEGIN
 

backends/apple/coreml/runtime/delegate/ETCoreMLAsset.mm

Lines changed: 8 additions & 4 deletions
@@ -5,15 +5,15 @@
 //
 // Please refer to the license found in the LICENSE file in the root directory of the source tree.
 
-#import <ETCoreMLAsset.h>
+#import "ETCoreMLAsset.h"
+
+#import "ETCoreMLLogging.h"
+#import "objc_safe_cast.h"
 
 #import <fcntl.h>
 #import <os/lock.h>
 #import <stdio.h>
 #import <system_error>
-
-#import <objc_safe_cast.h>
-
 namespace {
 using namespace executorchcoreml;
 
@@ -85,6 +85,10 @@ - (void)dealloc {
 
 - (BOOL)_keepAliveAndReturnError:(NSError * __autoreleasing *)error {
 if (!_isValid) {
+ETCoreMLLogErrorAndSetNSError(error,
+ETCoreMLErrorCorruptedModel,
+"The asset with identifier = %@ is invalid. Some required asset files appear to be missing.",
+_identifier);
 return NO;
 }
 

backends/apple/coreml/runtime/delegate/ETCoreMLAssetManager.h

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
 
 #import <Foundation/Foundation.h>
 
-#import <database.hpp>
+#import "database.hpp"
 
 @class ETCoreMLAsset;
 
