🚀 The feature, motivation and pitch
I’m trying to deploy the VITA-1.5 multimodal model (which supports audio, vision, and text) using ExecuTorch.
The tokenizer is in Hugging Face tokenizer.json format, and I’d like to ask:
- Is there a recommended way to convert the model to .pte format for ExecuTorch?
- Since this is a new architecture, is there any guidance or examples for adding custom models?
- Can I still use the LlamaDemo Android app with this multimodal model?
Alternatives
No response
Additional context
No response
RFC (Optional)
No response
cc @larryliu0820 @mergennachin @cccclai @helunwencser @jackzhxng