Run RosalindDB in five minutes

One docker compose up stands up the whole stack — Control Plane, Query Data Plane, three async workers, Postgres, Redis, and a MinIO bucket — on a single Docker network. No signup, no API key.

Quickstart

Four steps. Examples assume the default OSS run mode — RB_REQUIRE_AUTH off, single implicit default tenant, no auth headers. If you've flipped auth on, see Authentication for the bearer-token shape.

Step 1

Run the stack

You need Docker and curl. Nothing else.

git clone https://github.com/rosalinddb/rosalinddb.git
cd rosalinddb
docker compose up

The Control Plane comes up on http://localhost:8080 as the single public origin. MinIO, Postgres, Redis, and the validator / builder / ephemeral workers all run privately on the compose network. First boot pulls images and runs migrations; subsequent runs start in a few seconds.

Heads up: the OSS default has no auth. Do not expose :8080 to the public internet without first setting RB_REQUIRE_AUTH=true. The Control Plane logs a loud warning at startup if it detects a likely-public bind.

Step 2

Create a dataset and ingest

A dataset is a named, tenant-scoped collection of vectors with a fixed embedding dimension. The dimension is set at create time and immutable thereafter.

curl -X POST http://localhost:8080/v1/datasets \
  -H 'Content-Type: application/json' \
  -d '{"name": "products", "dimension": 4}'

Ingest a couple of vectors as NDJSON — one record per line, each with an id, a values array matching the dataset's dimension, and an optional metadata object:

curl -X POST http://localhost:8080/v1/datasets/products/vectors \
  -H 'Content-Type: application/x-ndjson' \
  --data-binary $'{"id":"a","values":[0.1,0.2,0.3,0.4],"metadata":{"category":"books"}}\n{"id":"b","values":[0.5,0.5,0.5,0.5],"metadata":{"category":"movies"}}\n'
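The `$'...'` quoting above is a bash/zsh feature, so for larger batches or plain POSIX shells it can be easier to write the records to a file first. The following is a minimal sketch of that approach. The helper names `ndjson_record` and `build_batch` and the file name `vectors.ndjson` are hypothetical; only the endpoint, headers, and record shape come from the example above. It does no JSON escaping, so it is only suitable for simple ids and category values.

```shell
# Hypothetical helper: print one NDJSON record on a single line.
#   $1 = id, $2 = comma-separated vector values, $3 = category
# Assumption: inputs contain no quotes or backslashes (no JSON escaping is done).
ndjson_record() {
  printf '{"id":"%s","values":[%s],"metadata":{"category":"%s"}}\n' "$1" "$2" "$3"
}

# Hypothetical helper: the same two records as the inline example above.
build_batch() {
  ndjson_record a 0.1,0.2,0.3,0.4 books
  ndjson_record b 0.5,0.5,0.5,0.5 movies
}

# Usage (not executed here; needs the stack from Step 1 running):
#   build_batch > vectors.ndjson
#   curl -X POST http://localhost:8080/v1/datasets/products/vectors \
#     -H 'Content-Type: application/x-ndjson' \
#     --data-binary @vectors.ndjson
```

`--data-binary @file` sends the file byte for byte, which keeps the newline after each record intact.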

Ingest returns immediately — the heavy work (validate, build the FAISS shard) runs on async workers behind a Redis reliable queue. Poll for readiness:

curl http://localhost:8080/v1/datasets/products
# wait until "status": "indexed"
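For scripts, the manual poll can be wrapped in a small loop. This is a sketch under stated assumptions, not part of RosalindDB: it assumes the GET response is JSON with a top-level "status" string, as the comment above suggests, and the names `RB_URL`, `fetch_dataset`, `extract_status`, and `wait_for_indexed` are made up for illustration. It only recognises "indexed" and otherwise keeps retrying until the attempt limit, since this page does not list failure statuses.

```shell
# Assumption: base URL of the Control Plane from Step 1.
RB_URL="${RB_URL:-http://localhost:8080}"

# Fetch the dataset document. Kept as its own function so it is easy to stub.
fetch_dataset() {
  curl -fsS "$RB_URL/v1/datasets/$1"
}

# Read JSON on stdin and print the first "status" string value found.
# This is a plain text match, not a JSON parser.
extract_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Poll until the status is "indexed".
#   $1 = dataset name, $2 = max attempts (default 30), $3 = delay in seconds (default 2)
# Returns 0 once indexed, 1 if the attempts run out.
wait_for_indexed() {
  wfi_name="$1"
  wfi_max="${2:-30}"
  wfi_delay="${3:-2}"
  wfi_try=0
  while [ "$wfi_try" -lt "$wfi_max" ]; do
    wfi_status="$(fetch_dataset "$wfi_name" 2>/dev/null | extract_status)"
    if [ "$wfi_status" = "indexed" ]; then
      return 0
    fi
    wfi_try=$((wfi_try + 1))
    sleep "$wfi_delay"
  done
  return 1
}
```

A typical call would be `wait_for_indexed products && echo "ready to query"`, run after the ingest request.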

Step 3

Query

Top-k nearest-neighbour search with an optional AND-of-equals metadata filter. The response carries the int64 FAISS hits translated back to your caller-supplied id and metadata via the sidecar.

curl -X POST http://localhost:8080/v1/query \
  -H 'Content-Type: application/json' \
  -d '{"dataset":"products","vector":[0.1,0.2,0.3,0.4],"top_k":2,"filter":{"category":"books"}}'

Response (HTTP 200):

{
  "matches": [
    { "id": "a", "score": 0.0, "metadata": {"category": "books"} }
  ]
}
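Since the quickstart only requires Docker and curl, here is a sketch for pulling the matched ids out of that response with standard text tools instead of a JSON parser. The function name `match_ids` is hypothetical, and the pattern assumes the response shape shown above; it would also pick up any "id" key nested inside metadata, so treat it as a convenience for smoke tests rather than a robust client.

```shell
# Read a query response on stdin and print one matched id per line.
# Assumption: ids are JSON strings without escaped quotes.
match_ids() {
  grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/.*:[[:space:]]*"\(.*\)"$/\1/'
}

# Usage (not executed here; needs the stack from Step 1 running):
#   curl -s -X POST http://localhost:8080/v1/query \
#     -H 'Content-Type: application/json' \
#     -d '{"dataset":"products","vector":[0.1,0.2,0.3,0.4],"top_k":2}' | match_ids
```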

Step 4

Where next

That's the core API surface in three calls (create, ingest, query), plus a status poll. From here:

Architecture — what's running under that one host port, and why.
Datasets — bulk import for uploads above 10 MiB, status lifecycle, the soft-delete model.
Query — nprobe, ephemeral results, the filter contract.
MCP server — point Claude / Cursor at your RosalindDB; agents call the REST API as MCP tools.

Browse the source on GitHub →