Introduce Vector abstraction #3193

Closed
@mp911de

Description

With a growing number of databases now supporting vector data (arrays of floating-point numbers or quantized int8, i.e. 8-bit integer, values), we should explore introducing a Vector abstraction.

The main goals are to simplify declaration, improve portability, and provide sensible default storage options.

In a domain model, one could declare:

class Article {

  // Candidate declarations for the same property; a real class would pick exactly one.

  List<Double> embedding;

  List<Float> embedding;

  double[] embedding;

  Double[] embedding;

  CqlVector embedding; // Cassandra

  Vector embedding; // MongoDB
}

Using store-specific types makes a domain type no longer portable across databases. We aim to answer the following questions:

  • What is the ideal property type to declare a vector?
  • How to persist (configure?) the vector if the underlying store provides various storage options?
  • How to handle vector data efficiently, optimizing for zero-copy access while addressing mutability issues?
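To make the questions above more concrete, here is a minimal sketch of what a store-agnostic vector type could look like. Every name in it (`EmbeddingVector`, `FloatEmbeddingVector`, `PortableArticle`, `of`, `unsafe`) is hypothetical and chosen for illustration only; it is not an existing API and not a proposal for the final design. It assumes a `float[]` backing array and shows one way to separate a safe, copying factory from a zero-copy one.

```java
import java.util.Arrays;

// Hypothetical sketch of a store-agnostic vector value type; names are illustrative only.
interface EmbeddingVector {

    // Safe factory: copies the input, so later changes to the caller's array
    // do not affect the vector.
    static EmbeddingVector of(float... values) {
        return new FloatEmbeddingVector(Arrays.copyOf(values, values.length));
    }

    // Zero-copy factory: wraps the given array directly. The caller is
    // responsible for not mutating the array afterwards.
    static EmbeddingVector unsafe(float[] values) {
        return new FloatEmbeddingVector(values);
    }

    // Number of dimensions.
    int size();

    // Returns a copy, so callers cannot modify the vector's internal state.
    float[] toFloatArray();

    // Widening conversion for stores that persist double precision.
    double[] toDoubleArray();
}

// Assumed float-backed implementation; a quantized int8 variant could follow the same shape.
final class FloatEmbeddingVector implements EmbeddingVector {

    private final float[] values;

    FloatEmbeddingVector(float[] values) {
        this.values = values;
    }

    @Override
    public int size() {
        return values.length;
    }

    @Override
    public float[] toFloatArray() {
        return Arrays.copyOf(values, values.length);
    }

    @Override
    public double[] toDoubleArray() {
        double[] result = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            result[i] = values[i];
        }
        return result;
    }
}

// The domain type declares the portable abstraction instead of a store-specific type.
class PortableArticle {
    EmbeddingVector embedding;
}
```

In this sketch, `EmbeddingVector.of(...)` trades one array copy for immutability, while `EmbeddingVector.unsafe(...)` avoids the copy and leaves mutability to the caller. How a store-specific converter would map such a type to its native representation is left open here.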
