[Cadence] add reference quantized fully connected out #9018
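
For context on the PR title: a "quantized fully connected out" operator computes a fully connected (linear) layer over affine-quantized integer tensors and writes the requantized result into a caller-allocated `out` tensor, following the out-variant convention ExecuTorch kernels use. The sketch below is a minimal illustration of the usual per-tensor arithmetic in plain Python; it is an assumption-laden sketch, not the Cadence reference implementation, and every parameter name is illustrative.

```python
# Minimal sketch of a reference quantized fully connected "out" op.
# Assumes per-tensor affine quantization: real = scale * (q - zero_point).
# Illustrative only; not the Cadence kernel, and all names are hypothetical.

def quantized_fully_connected_out(
    x,            # [M][K] quantized input values (ints)
    w,            # [N][K] quantized weight values (ints)
    bias,         # [N] bias, already in the int32 accumulator domain
    x_zp,         # input zero point
    w_zp,         # weight zero point
    out_scale,    # requantization scale: (x_scale * w_scale) / y_scale
    out_zp,       # output zero point
    out,          # [M][N] preallocated output, written in place
):
    M, K, N = len(x), len(x[0]), len(w)
    for m in range(M):
        for n in range(N):
            acc = bias[n]
            for k in range(K):
                acc += (x[m][k] - x_zp) * (w[n][k] - w_zp)  # integer accumulate
            q = round(acc * out_scale) + out_zp             # requantize
            out[m][n] = max(-128, min(127, q))              # clamp to int8 range
    return out
```

A typical call passes int8 activations and weights with an int32 bias and receives the int8 result in `out` rather than in a newly allocated tensor.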

Closed

Changes from all commits (49 commits):

4bebc49 Fix broken tests (metascroy, Mar 4, 2025)
0740a5d fix wrong error msg (Gasoonjia, Mar 4, 2025)
a048c2c fix head_dim in metadata (navsud, Mar 4, 2025)
ee2180e Qualcomm AI Engine Direct - Meta CI for Mobilebert , W2L, and Llama (… (winskuo-quic, Mar 4, 2025)
0c288c5 Arm backend: Enable test_w2l_u85_BI (#8880) (martinlsm, Mar 4, 2025)
c080349 Update using-executorch-building-from-source.md (#8925) (mergennachin, Mar 4, 2025)
b2edd04 Arm backend: Update fuse_batchnorm_pass to create new placeholders (#… (AdrianLundell, Mar 4, 2025)
c3b7ef9 [minibench] Drop outliers from benchmark result (#8919) (kirklandsign, Mar 4, 2025)
612a6e1 Fix ANE llama export (#8904) (metascroy, Mar 4, 2025)
61ee31d [ExecuTorch][XNNPACK] Don't partition per_tensor weights with qd8 (#8… (pytorchbot, Mar 4, 2025)
a78101b Link xnn_executor_runner with optimized op library (#8901) (swolchok, Mar 4, 2025)
2a11642 [Windows] [Tensor.cpp] add #include <algorithm> (#8912) (SamGondelman, Mar 4, 2025)
5bb91d5 Add cpu_thread setting logic to xnn_executor_runner (#8902) (swolchok, Mar 4, 2025)
6d26449 Add a pass to remove certain redundant branched quant/dequant nodes (Vysarat, Mar 4, 2025)
bf6e71e [ExecuTorch][XNNPACK] Rename linear weight partitioning flag for clarity (kirklandsign, Mar 4, 2025)
b09fdce portable arg{max,min}: optimize update check (#8863) (swolchok, Mar 4, 2025)
0f48136 Fix trunk.yml (#8949) (metascroy, Mar 4, 2025)
338d936 [Android demo] Decouple pte file from assets and remove unused (kirklandsign, Mar 5, 2025)
6bf4e5b Add Phi-4-mini-instruct (#8856) (jackzhxng, Mar 5, 2025)
41dd47d Add optimized kernels to executorch pybindings (larryliu0820, Mar 5, 2025)
9c45f2f fix -Werror -Wunused in executor_runner (#8955) (swolchok, Mar 5, 2025)
760272c add BroadcastIndexesRange (#8864) (swolchok, Mar 5, 2025)
dc957db Arm backend: Fix Timing Adapter settings depending on the memory mode… (gggekov, Mar 5, 2025)
2a7e028 [executorch][runtime] Introduce PteDataMap for weight sharing (#8960) (pytorchbot, Mar 5, 2025)
d5dfaac introduce file_data_sink (Gasoonjia, Mar 5, 2025)
927bdda Add unfold_copy.out (larryliu0820, Mar 5, 2025)
2a8e29b Add max_pool2d_with_indices_backward (manuelcandales, Mar 5, 2025)
5a24e92 [Windows] [file_data_loader.cpp] Add compat_unistd.h (#8913) (SamGondelman, Mar 5, 2025)
8660dfb Add support for ptd in runner (#8957) (pytorchbot, Mar 5, 2025)
9429381 [Windows] don't use invalid flags on Windows (#8915) (SamGondelman, Mar 5, 2025)
fe39fd6 Adding Convolution operator optimizations (cad-audio, Mar 5, 2025)
03a103d Add proper CMake build for extension_parallel (#8938) (swolchok, Mar 5, 2025)
55e102f Qualcomm AI Engine Direct - Remove copy headers mechanism (#8877) (haowhsu-quic, Mar 5, 2025)
312111c [executorch][runtime] Add get_named_data_map to Program (#8961) (pytorchbot, Mar 5, 2025)
5e244df Default ExecuTorch targets to ExecuTorch-wide Buck visibility (#8969) (swolchok, Mar 5, 2025)
76dcda3 Fix android demo app java build (kirklandsign, Mar 5, 2025)
db5f474 [Portable] Easy fix of unfold_copy_out function signature (#8975) (SS-JIA, Mar 5, 2025)
23b10a0 [Windows] [mmap_data_loader.cpp] mmap equivalent for Windows (#8916) (SamGondelman, Mar 5, 2025)
522e99b Remove unused build/test_android_ci.sh (#8976) (kirklandsign, Mar 5, 2025)
f2ed70e bump pytorch version (#8922) (cccclai, Mar 6, 2025)
e4ab6c2 Deploy BroadcastIndexesRange (#8865) (swolchok, Mar 6, 2025)
9c0f20f Fix phi4mini test model (#8971) (jackzhxng, Mar 6, 2025)
9830c26 Revert "[Benchmark] fail test if model artifact does not exist" (#8985) (yangw-dev, Mar 6, 2025)
704203a Do not export autogenerated headers. (#8993) (shoumikhin, Mar 6, 2025)
5193c08 Add extension parallel as a dep to kernels custom framework (#8996) (shoumikhin, Mar 6, 2025)
ca25c7f Arm backend: Add FuseViewCopyTransform and FuseConstantsPass in arm_p… (AdrianLundell, Mar 6, 2025)
a10483f Update update-viablestrict.yml to ubuntu-22.04 (#8972) (mergennachin, Mar 6, 2025)
318b015 fix mman header issues (#8989) (SamGondelman, Mar 6, 2025)
dd60f2e init (Mar 6, 2025)
2 changes: 1 addition & 1 deletion .ci/docker/ci_commit_pins/pytorch.txt
@@ -1 +1 @@
27e35de6c288bffad1b4d18b393579c1d1a95547
08434df1f2f88c9770e59246caa2ff9c6f613270
27 changes: 27 additions & 0 deletions .ci/scripts/test_ane_static_llama.sh
@@ -0,0 +1,27 @@
#!/bin/bash
# Copyright (c) Qualcomm Innovation Center, Inc.
# All rights reserved
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

set -exu

source "$(dirname "${BASH_SOURCE[0]}")/utils.sh"

export EXECUTORCH_ROOT="$(dirname "${BASH_SOURCE[0]}")/../.."

if [[ -z "${PYTHON_EXECUTABLE:-}" ]]; then
PYTHON_EXECUTABLE=python3
fi

which "${PYTHON_EXECUTABLE}"

pushd $EXECUTORCH_ROOT/examples/apple/coreml/llama

# Download stories llama110m artifacts
download_stories_model_artifacts

python export.py -n model.pte -p params.json -c stories110M.pt --seq_length 32 --max_seq_length 64 --dtype fp16 --coreml-quantize c4w

popd
18 changes: 17 additions & 1 deletion .ci/scripts/test_model.sh
@@ -100,6 +100,15 @@ test_model() {
rm "./${MODEL_NAME}.pte"
return # Skip running with portable executor runner since portable doesn't support Qwen's biased linears.
fi
if [[ "${MODEL_NAME}" == "phi-4-mini" ]]; then
# Install requirements for export_llama
bash examples/models/llama/install_requirements.sh
# Test export_llama script: python3 -m examples.models.llama.export_llama.
"${PYTHON_EXECUTABLE}" -m examples.models.llama.export_llama --model "${MODEL_NAME}" -c examples/models/llama/params/demo_rand_params.pth -p examples/models/phi-4-mini/config.json
run_portable_executor_runner
rm "./${MODEL_NAME}.pte"
return
fi

# Export a basic .pte and run the model.
"${PYTHON_EXECUTABLE}" -m examples.portable.scripts.export --model_name="${MODEL_NAME}" "${STRICT}"
@@ -164,6 +173,7 @@ test_model_with_qnn() {
export LD_LIBRARY_PATH=$QNN_SDK_ROOT/lib/x86_64-linux-clang/
export PYTHONPATH=$EXECUTORCH_ROOT/..

EXTRA_FLAGS=""
if [[ "${MODEL_NAME}" == "dl3" ]]; then
EXPORT_SCRIPT=deeplab_v3
elif [[ "${MODEL_NAME}" == "mv3" ]]; then
@@ -176,6 +186,12 @@
EXPORT_SCRIPT=inception_v3
elif [[ "${MODEL_NAME}" == "vit" ]]; then
EXPORT_SCRIPT=torchvision_vit
elif [[ "${MODEL_NAME}" == "mb" ]]; then
EXPORT_SCRIPT=mobilebert_fine_tune
EXTRA_FLAGS="--num_epochs 1"
pip install scikit-learn
elif [[ "${MODEL_NAME}" == "w2l" ]]; then
EXPORT_SCRIPT=wav2letter
elif [[ "${MODEL_NAME}" == "edsr" ]]; then
EXPORT_SCRIPT=edsr
# Additional deps for edsr
@@ -189,7 +205,7 @@
# TODO(guangyang): Make QNN chipset match the target device
QNN_CHIPSET=SM8450

"${PYTHON_EXECUTABLE}" -m examples.qualcomm.scripts.${EXPORT_SCRIPT} -b ${CMAKE_OUTPUT_DIR} -m ${QNN_CHIPSET} --compile_only
"${PYTHON_EXECUTABLE}" -m examples.qualcomm.scripts.${EXPORT_SCRIPT} -b ${CMAKE_OUTPUT_DIR} -m ${QNN_CHIPSET} --compile_only $EXTRA_FLAGS
EXPORTED_MODEL=$(find "./${EXPORT_SCRIPT}" -type f -name "${MODEL_NAME}*.pte" -print -quit)
}

120 changes: 57 additions & 63 deletions .github/workflows/android-perf.yml
@@ -96,6 +96,63 @@ jobs:

PYTHONPATH="${PWD}" python .ci/scripts/gather_benchmark_configs.py $ARGS

prepare-test-specs:
runs-on: linux.2xlarge
needs: set-parameters
strategy:
matrix: ${{ fromJson(needs.set-parameters.outputs.benchmark_configs) }}
fail-fast: false
steps:
- uses: actions/checkout@v3

- name: Prepare the spec
id: prepare
shell: bash
env:
BENCHMARK_CONFIG: ${{ toJSON(matrix) }}
working-directory: extension/benchmark/android/benchmark
run: |
set -eux

# The model will be exported in the next step to this S3 path
MODEL_PATH="https://gha-artifacts.s3.amazonaws.com/${{ github.repository }}/${{ github.run_id }}/artifacts/${{ matrix.model }}_${{ matrix.config }}/model.zip"
# We could write a script to properly use jinja here, but there is only one variable,
# so let's just sed it
sed -i -e 's,{{ model_path }},'"${MODEL_PATH}"',g' android-llm-device-farm-test-spec.yml.j2

BENCHMARK_CONFIG_ID=$(echo "${{ matrix.model }}_${{ matrix.config }}" | sed -e 's/[^A-Za-z0-9._-]/_/g')
# This is the config for this benchmark run; we save it in the test spec so that it can be fetched
# later by the upload script
sed -i -e 's,{{ benchmark_config_id }},'"${BENCHMARK_CONFIG_ID}"',g' android-llm-device-farm-test-spec.yml.j2

cp android-llm-device-farm-test-spec.yml.j2 android-llm-device-farm-test-spec.yml
# Just print the test spec for debugging
cat android-llm-device-farm-test-spec.yml

# Save the benchmark configs so that we can use it later in the dashboard
echo "${BENCHMARK_CONFIG}" > "${BENCHMARK_CONFIG_ID}.json"
echo "benchmark-config-id=${BENCHMARK_CONFIG_ID}" >> $GITHUB_OUTPUT

- name: Upload the spec
uses: seemethere/upload-artifact-s3@v5
with:
s3-bucket: gha-artifacts
s3-prefix: |
${{ github.repository }}/${{ github.run_id }}/artifacts/${{ matrix.model }}_${{ matrix.config }}
retention-days: 1
if-no-files-found: error
path: extension/benchmark/android/benchmark/android-llm-device-farm-test-spec.yml

- name: Update the benchmark configs
uses: seemethere/upload-artifact-s3@v5
with:
s3-bucket: gha-artifacts
s3-prefix: |
${{ github.repository }}/${{ github.run_id }}/artifacts/benchmark-configs/
retention-days: 1
if-no-files-found: error
path: extension/benchmark/android/benchmark/${{ steps.prepare.outputs.benchmark-config-id }}.json

export-models:
name: export-models
uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main
@@ -278,69 +335,6 @@ jobs:
fi
echo "::endgroup::"

prepare-test-specs:
runs-on: linux.2xlarge
needs:
- set-parameters
- export-models
strategy:
matrix: ${{ fromJson(needs.set-parameters.outputs.benchmark_configs) }}
fail-fast: false
steps:
- uses: actions/checkout@v3

- name: Prepare the spec
id: prepare
shell: bash
env:
BENCHMARK_CONFIG: ${{ toJSON(matrix) }}
working-directory: extension/benchmark/android/benchmark
run: |
set -eux

# The model will be exported in the next step to this S3 path
MODEL_PATH="https://gha-artifacts.s3.amazonaws.com/${{ github.repository }}/${{ github.run_id }}/artifacts/${{ matrix.model }}_${{ matrix.config }}/model.zip"

# Check that the model artifact exists; if it doesn't, fail this step and skip generating the test spec.
curl -s --head -f ${MODEL_PATH}

# We could write a script to properly use jinja here, but there is only one variable,
# so let's just sed it
sed -i -e 's,{{ model_path }},'"${MODEL_PATH}"',g' android-llm-device-farm-test-spec.yml.j2

BENCHMARK_CONFIG_ID=$(echo "${{ matrix.model }}_${{ matrix.config }}" | sed -e 's/[^A-Za-z0-9._-]/_/g')
# This is the config for this benchmark run; we save it in the test spec so that it can be fetched
# later by the upload script
sed -i -e 's,{{ benchmark_config_id }},'"${BENCHMARK_CONFIG_ID}"',g' android-llm-device-farm-test-spec.yml.j2

cp android-llm-device-farm-test-spec.yml.j2 android-llm-device-farm-test-spec.yml
# Just print the test spec for debugging
cat android-llm-device-farm-test-spec.yml

# Save the benchmark configs so that we can use it later in the dashboard
echo "${BENCHMARK_CONFIG}" > "${BENCHMARK_CONFIG_ID}.json"
echo "benchmark-config-id=${BENCHMARK_CONFIG_ID}" >> $GITHUB_OUTPUT

- name: Upload the spec
uses: seemethere/upload-artifact-s3@v5
with:
s3-bucket: gha-artifacts
s3-prefix: |
${{ github.repository }}/${{ github.run_id }}/artifacts/${{ matrix.model }}_${{ matrix.config }}
retention-days: 1
if-no-files-found: error
path: extension/benchmark/android/benchmark/android-llm-device-farm-test-spec.yml

- name: Update the benchmark configs
uses: seemethere/upload-artifact-s3@v5
with:
s3-bucket: gha-artifacts
s3-prefix: |
${{ github.repository }}/${{ github.run_id }}/artifacts/benchmark-configs/
retention-days: 1
if-no-files-found: error
path: extension/benchmark/android/benchmark/${{ steps.prepare.outputs.benchmark-config-id }}.json

build-benchmark-app:
name: build-benchmark-app
uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main
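The "let's just sed it" comment in the prepare-test-specs job above refers to the two Jinja2 variables in the spec template, {{ model_path }} and {{ benchmark_config_id }}, which the job fills in with sed. For reference, a rough Python equivalent of that step, assuming jinja2 is installed and using an illustrative benchmark config name, might look like:

```python
# Rough Python equivalent of the sed-based templating in prepare-test-specs.
# Assumes jinja2 is available; the config name below is illustrative.
import re
from jinja2 import Template

model_path = "https://gha-artifacts.s3.amazonaws.com/.../model.zip"  # placeholder URL

# Same sanitization as: sed -e 's/[^A-Za-z0-9._-]/_/g'
benchmark_config_id = re.sub(r"[^A-Za-z0-9._-]", "_", "llama_xnnpack+custom")

with open("android-llm-device-farm-test-spec.yml.j2") as f:
    rendered = Template(f.read()).render(
        model_path=model_path,
        benchmark_config_id=benchmark_config_id,
    )

with open("android-llm-device-farm-test-spec.yml", "w") as f:
    f.write(rendered)
```

For a single pair of substitutions, the sed approach keeps the job free of extra Python dependencies on the runner, which is presumably why the workflow takes that route.
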
119 changes: 57 additions & 62 deletions .github/workflows/apple-perf.yml
@@ -98,6 +98,63 @@ jobs:

echo "benchmark_configs is: ${{ steps.set-parameters.outputs.benchmark_configs }}"

prepare-test-specs:
runs-on: linux.2xlarge
needs: set-parameters
strategy:
matrix: ${{ fromJson(needs.set-parameters.outputs.benchmark_configs) }}
fail-fast: false
steps:
- uses: actions/checkout@v3

- name: Prepare the spec
id: prepare
shell: bash
env:
BENCHMARK_CONFIG: ${{ toJSON(matrix) }}
working-directory: extension/benchmark/apple/Benchmark
run: |
set -eux

# The model will be exported in the next step to this S3 path
MODEL_PATH="https://gha-artifacts.s3.amazonaws.com/${{ github.repository }}/${{ github.run_id }}/artifacts/${{ matrix.model }}_${{ matrix.config }}/model.zip"
# We could write a script to properly use jinja here, but there is only one variable,
# so let's just sed it
sed -i -e 's,{{ model_path }},'"${MODEL_PATH}"',g' default-ios-device-farm-appium-test-spec.yml.j2

BENCHMARK_CONFIG_ID=$(echo "${{ matrix.model }}_${{ matrix.config }}" | sed -e 's/[^A-Za-z0-9._-]/_/g')
# This is the config for this benchmark run; we save it in the test spec so that it can be fetched
# later by the upload script
sed -i -e 's,{{ benchmark_config_id }},'"${BENCHMARK_CONFIG_ID}"',g' default-ios-device-farm-appium-test-spec.yml.j2

cp default-ios-device-farm-appium-test-spec.yml.j2 default-ios-device-farm-appium-test-spec.yml
# Just print the test spec for debugging
cat default-ios-device-farm-appium-test-spec.yml

# Save the benchmark configs so that we can use it later in the dashboard
echo "${BENCHMARK_CONFIG}" > "${BENCHMARK_CONFIG_ID}.json"
echo "benchmark-config-id=${BENCHMARK_CONFIG_ID}" >> $GITHUB_OUTPUT

- name: Upload the spec
uses: seemethere/upload-artifact-s3@v5
with:
s3-bucket: gha-artifacts
s3-prefix: |
${{ github.repository }}/${{ github.run_id }}/artifacts/${{ matrix.model }}_${{ matrix.config }}
retention-days: 1
if-no-files-found: error
path: extension/benchmark/apple/Benchmark/default-ios-device-farm-appium-test-spec.yml

- name: Update the benchmark configs
uses: seemethere/upload-artifact-s3@v5
with:
s3-bucket: gha-artifacts
s3-prefix: |
${{ github.repository }}/${{ github.run_id }}/artifacts/benchmark-configs/
retention-days: 1
if-no-files-found: error
path: extension/benchmark/apple/Benchmark/${{ steps.prepare.outputs.benchmark-config-id }}.json

export-models:
name: export-models
uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
@@ -287,68 +344,6 @@ jobs:
fi
echo "::endgroup::"

prepare-test-specs:
runs-on: linux.2xlarge
needs:
- set-parameters
- export-models
strategy:
matrix: ${{ fromJson(needs.set-parameters.outputs.benchmark_configs) }}
fail-fast: false
steps:
- uses: actions/checkout@v3

- name: Prepare the spec
id: prepare
shell: bash
env:
BENCHMARK_CONFIG: ${{ toJSON(matrix) }}
working-directory: extension/benchmark/apple/Benchmark
run: |
set -eux

# The model will be exported in the next step to this S3 path
MODEL_PATH="https://gha-artifacts.s3.amazonaws.com/${{ github.repository }}/${{ github.run_id }}/artifacts/${{ matrix.model }}_${{ matrix.config }}/model.zip"
# Check that the model artifact exists; if it doesn't, fail this step and skip generating the test spec.
curl -s --head -f ${MODEL_PATH}
# We could write a script to properly use jinja here, but there is only one variable,
# so let's just sed it
sed -i -e 's,{{ model_path }},'"${MODEL_PATH}"',g' default-ios-device-farm-appium-test-spec.yml.j2

BENCHMARK_CONFIG_ID=$(echo "${{ matrix.model }}_${{ matrix.config }}" | sed -e 's/[^A-Za-z0-9._-]/_/g')
# This is the config for this benchmark run; we save it in the test spec so that it can be fetched
# later by the upload script
sed -i -e 's,{{ benchmark_config_id }},'"${BENCHMARK_CONFIG_ID}"',g' default-ios-device-farm-appium-test-spec.yml.j2

cp default-ios-device-farm-appium-test-spec.yml.j2 default-ios-device-farm-appium-test-spec.yml
# Just print the test spec for debugging
cat default-ios-device-farm-appium-test-spec.yml

# Save the benchmark configs so that we can use it later in the dashboard
echo "${BENCHMARK_CONFIG}" > "${BENCHMARK_CONFIG_ID}.json"
echo "benchmark-config-id=${BENCHMARK_CONFIG_ID}" >> $GITHUB_OUTPUT

- name: Upload the spec
uses: seemethere/upload-artifact-s3@v5
with:
s3-bucket: gha-artifacts
s3-prefix: |
${{ github.repository }}/${{ github.run_id }}/artifacts/${{ matrix.model }}_${{ matrix.config }}
retention-days: 1
if-no-files-found: error
path: extension/benchmark/apple/Benchmark/default-ios-device-farm-appium-test-spec.yml

- name: Update the benchmark configs
uses: seemethere/upload-artifact-s3@v5
with:
s3-bucket: gha-artifacts
s3-prefix: |
${{ github.repository }}/${{ github.run_id }}/artifacts/benchmark-configs/
retention-days: 1
if-no-files-found: error
path: extension/benchmark/apple/Benchmark/${{ steps.prepare.outputs.benchmark-config-id }}.json


build-benchmark-app:
name: build-benchmark-app
uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
24 changes: 23 additions & 1 deletion .github/workflows/trunk.yml
@@ -229,6 +229,28 @@ jobs:
# see if we can import the module successfully
${CONDA_RUN} python -c "from executorch.extension.pybindings import portable_lib; print('success!')"

test-static-llama-ane:
name: test-static-llama-ane
uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
with:
runner: macos-m1-stable
python-version: '3.11'
submodules: 'true'
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
script: |
set -eux
bash .ci/scripts/setup-conda.sh
eval "$(conda shell.bash hook)"

# Install requirements
sh install_requirements.sh
sh backends/apple/coreml/scripts/install_requirements.sh
python install_executorch.py --pybind coreml
sh examples/models/llama/install_requirements.sh

# Test ANE llama
sh .ci/scripts/test_ane_static_llama.sh

test-llama-runner-macos:
name: test-llama-runner-mac
uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
@@ -311,7 +333,7 @@ jobs:
strategy:
matrix:
dtype: [fp32]
model: [dl3, mv3, mv2, ic4, ic3, vit]
model: [dl3, mv3, mv2, ic4, ic3, vit, mb, w2l]
fail-fast: false
with:
runner: linux.2xlarge
2 changes: 1 addition & 1 deletion .github/workflows/update-viablestrict.yml
@@ -12,7 +12,7 @@ concurrency:
jobs:
do_update_viablestrict:
if: ${{ github.repository_owner == 'pytorch' }}
runs-on: ubuntu-20.04
runs-on: ubuntu-22.04
environment: ${{ (github.event_name == 'schedule') && 'update-viable-strict' || '' }}
steps:
- name: Update viable/strict