Update quantization overview doc page #10603

Open
@GregoryComer

📚 The doc issue

The current quantization overview page is a bit sparse: https://pytorch.org/executorch/main/quantization-overview.html. I'd like to update it as follows:

  • Move under Usage/ since it's the only page under Quantization/ currently.
  • Split out information intended for backend authors (info about writing a quantizer, for example). Focus on user-facing APIs.
  • Document backend-invariant quantization flows (embeddings, ao kernels, etc.). Include information on, and an example of, the composable quantizer.
  • Document PT2E and quantize_ flows.
  • Cover the general, high-level approach to quantizing different types of models:
    • CV models
    • Transformers / language models
  • Briefly discuss options for evaluating quantized model accuracy (for example, running in eager mode vs. pybindings vs. on-device).

Suggest a potential alternative/fix

No response

cc @mergennachin @byjlw @kimishpatel @jerryzh168

Metadata

Assignees

No one assigned

    Labels

    module: doc (Issues related to documentation, both in docs/ and inlined in code)
    module: quantization (Issues related to quantization)
    module: user experience (Issues related to reducing friction for users)

    Projects

    Status

    To triage

    Milestone

    No milestone
