
Quantized model repository for different backends #11034

Open
@shreshth-tru

Description


🚀 The feature, motivation and pitch

Hi,
I've been wondering whether there is a public repository of .pte files for specific backends. For example, while I can generate .pte files for the Llama 3.2 1B model targeting the QNN backend, that model doesn't work at all: it consistently produces gibberish output, as others have also reported.
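
For context, what I mean by "generating a .pte file" is essentially the standard ExecuTorch export path. The snippet below is only a minimal sketch: it uses a toy module in place of Llama, omits quantization, and omits the QNN lowering step (which the Llama export script in the ExecuTorch examples handles via the Qualcomm partitioner), so TinyModel and tiny_model.pte are placeholders rather than my actual setup.

```python
# Minimal sketch (not the actual Llama/QNN flow): export a toy module to a .pte
# file using the generic ExecuTorch path. TinyModel and tiny_model.pte are
# placeholders; backend lowering (e.g. QNN) would add a to_backend(...) step.
import torch
from torch.export import export
from executorch.exir import to_edge


class TinyModel(torch.nn.Module):
    """Stand-in for the real model (e.g. Llama 3.2 1B)."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(16, 16)

    def forward(self, x):
        return torch.relu(self.linear(x))


example_inputs = (torch.randn(1, 16),)

# 1) Capture the graph with torch.export.
exported = export(TinyModel(), example_inputs)

# 2) Convert to the Edge dialect. Lowering to a specific backend such as QNN
#    would normally happen at this stage via that backend's partitioner.
edge = to_edge(exported)

# 3) Serialize to the ExecuTorch program format and write the .pte file.
et_program = edge.to_executorch()
with open("tiny_model.pte", "wb") as f:
    f.write(et_program.buffer)
```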

Unfortunately, I don't have the compute resources to generate .pte files for larger models such as the 8B variant, which might actually work, so the whole exercise is stuck for me.

Is there a community repository or source where precompiled .pte files for larger models and backends are shared?

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

cc @larryliu0820 @mergennachin @cccclai @helunwencser @jackzhxng

Metadata

Assignees: No one assigned

Labels:
module: llm (Issues related to LLM examples and apps, and to the extensions/llm/ code)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Projects: To triage
