v0.2.2

@cmodi-meta released this 14 Apr 23:05

Llama Stack SDK 0.2.2 Update

Update SDK to support Llama Stack v0.2.2 which includes multi-image inference.

Local RAG Support

The major update enables local RAG. The local RAG implementation is 100% offline and runs completely on-device.

The SDK's local module supports the end-to-end flow (sketched after this list) of:

  1. Creating a vector DB instance
  2. Creating text chunks
  3. Receiving embeddings from the Android app
  4. Storing embeddings in the vector DB
  5. Managing the agent turn with a RAG tool call to receive a relevant response from the LLM
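
For illustration, the sketch below walks through that flow in Kotlin. The class and method names (the register and insert builders, the agent turn call, and the `splitIntoChunks` helper) are assumptions modeled on the Kotlin client's builder style, not a verbatim copy of the SDK surface:

```kotlin
// Illustrative sketch only: class and method names are assumptions, not the
// exact SDK surface. Assumes `client` is an already-built LlamaStackClient
// configured for local (on-device) execution.

// 1. Create a vector DB instance (backed by ObjectBox on-device).
val vectorDbId = "car-manual-db"
client.vectorDbs().register(
    VectorDbRegisterParams.builder()
        .vectorDbId(vectorDbId)
        .embeddingModel("all-MiniLM-L6-v2") // assumed on-device embedding model
        .build()
)

// 2-4. Create text chunks from the document, embed them, and store the
//      embeddings in the vector DB. `splitIntoChunks` is a hypothetical helper.
val chunks = splitIntoChunks(documentText, maxTokens = 512)
client.vectorIo().insert(
    VectorIoInsertParams.builder()
        .vectorDbId(vectorDbId)
        .chunks(chunks)
        .build()
)

// 5. Run an agent turn; the RAG tool retrieves the most relevant chunks
//    and the LLM grounds its answer in them.
val turn = agent.createTurn(
    AgentTurnCreateParams.builder()
        .messages(listOf(userMessage("How do I reset the tire pressure warning light?")))
        .build()
)
```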

On-device Vector DB solution: ObjectBox
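
As a sketch of what the on-device storage can look like, an ObjectBox entity for a RAG chunk holds the chunk text and its embedding behind an HNSW vector index. The 384-dimension value below is an assumption tied to whichever embedding model you use:

```kotlin
import io.objectbox.annotation.Entity
import io.objectbox.annotation.HnswIndex
import io.objectbox.annotation.Id

// Minimal ObjectBox entity for a RAG chunk: the raw text plus its embedding,
// indexed with HNSW for approximate nearest-neighbor search.
@Entity
data class RagChunk(
    @Id var id: Long = 0,
    var text: String = "",
    // Dimension must match the embedding model's output size (384 is assumed here).
    @HnswIndex(dimensions = 384)
    var embedding: FloatArray = floatArrayOf()
)
```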

Android Demo App

RAG

We've added a RAG feature in the demo app to showcase how to use the remote and local RAG SDKs. The remote RAG feature contains all RAG-specific logic: creating a document object, registering a vector DB, and using RagTool from Llama Stack.

  • Improved User Experience: The remote RAG feature provides a seamless experience for users, allowing them to ask questions and receive accurate answers quickly.
  • Increased Efficiency: With the ability to process large documents and retrieve relevant information, the remote RAG feature saves time and increases productivity.

In this example, a PDF or text file (e.g., a car manual) can easily be processed for question-answering scenarios with the user.

Also, with a few lines of code changed, you can switch to local RAG. That's the advantage of the Llama Stack mobile SDKs: you can interoperate between remote and local without major code changes!
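
For example, the switch typically comes down to which client you construct; the rest of the app talks to the same interface. The builder names below follow the Kotlin client's style, and the URL, model path, and tokenizer path are placeholders:

```kotlin
// Remote: talk to a Llama Stack server over HTTP.
val remoteClient = LlamaStackClientOkHttpClient.builder()
    .baseUrl("http://<your-llama-stack-server>:<port>")
    .build()

// Local: run fully on-device with the local module (ExecuTorch).
val localClient = LlamaStackClientLocalClient.builder()
    .modelPath("/data/local/tmp/llama/llama3_2_1b.pte")     // placeholder model path
    .tokenizerPath("/data/local/tmp/llama/tokenizer.model") // placeholder tokenizer path
    .temperature(0.0f)
    .build()

// Everything downstream (inference, agents, RAG) is written against the same
// client interface, so remote vs. local is a construction-time choice.
```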

Multi-image Inference

We've built in sample support for selecting multiple images and running inference with Llama 4.
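
A rough sketch of what that looks like with the Kotlin client follows; the content-item helpers and message constructor are illustrative assumptions, and the Llama 4 model id is just an example:

```kotlin
// Illustrative sketch: helper and class names are assumptions.
// Build one user message carrying several images plus a text prompt,
// then run chat completion against a multimodal Llama 4 model.

val imageUris: List<Uri> = imagesPickedByUser()              // hypothetical picker result

val content = imageUris.map { uri ->
    imageContentItemFromUri(contentResolver, uri)            // hypothetical: base64-encode + wrap
} + textContentItem("Compare these images and describe the differences.")

val response = client.inference().chatCompletion(
    InferenceChatCompletionParams.builder()
        .modelId("meta-llama/Llama-4-Scout-17B-16E-Instruct") // example Llama 4 model id
        .messages(listOf(userMessage(content)))               // hypothetical message constructor
        .build()
)
```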

Contributors

@ashwinb, @cmodi-meta, @dltn, @Riandy, @seyeong-han, @WuhanMonkey, @yanxi0830