Run RosalindDB in five minutes
One docker compose up stands up the whole stack — Control Plane, Query Data Plane, three async workers, Postgres, Redis, and a MinIO bucket — on a single Docker network. No signup, no API key.
Quickstart
Four steps. Examples assume the default OSS run mode — RB_REQUIRE_AUTH off, single implicit default tenant, no auth headers. If you've flipped auth on, see Authentication for the bearer-token shape.
Step 1
Run the stack
You need Docker and curl. Nothing else.
git clone https://github.com/rosalinddb/rosalinddb.git
cd rosalinddb
docker compose up
The Control Plane comes up on http://localhost:8080 as the single public origin. MinIO, Postgres, Redis, and the validator / builder / ephemeral workers all run privately on the compose network. First boot pulls images and runs migrations; subsequent runs start in a few seconds.
Heads up: the OSS default has no auth. Do not expose :8080 to the public internet without flipping RB_REQUIRE_AUTH=true first. The CP logs a loud warning at startup if it detects a likely-public bind.
Step 2
Create a dataset and ingest
A dataset is a named, tenant-scoped collection of vectors with a fixed embedding dimension. The dimension is set at create time and immutable thereafter.
curl -X POST http://localhost:8080/v1/datasets \
-H 'Content-Type: application/json' \
-d '{"name": "products", "dimension": 4}'
Ingest a couple of vectors as NDJSON: one record per line, each with an id, a values array matching the dataset's dimension, and an optional metadata object:
curl -X POST http://localhost:8080/v1/datasets/products/vectors \
-H 'Content-Type: application/x-ndjson' \
--data-binary $'{"id":"a","values":[0.1,0.2,0.3,0.4],"metadata":{"category":"books"}}\n{"id":"b","values":[0.5,0.5,0.5,0.5],"metadata":{"category":"movies"}}\n'
Ingest returns immediately; the heavy work (validate, build the FAISS shard) runs on async workers behind a Redis reliable queue. Poll for readiness:
curl http://localhost:8080/v1/datasets/products # wait until "status": "indexed"
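In a script, the readiness check above can be wrapped in a loop that blocks until the index is built. The following is a minimal sketch, not part of RosalindDB: the function names is_indexed and wait_until_indexed, the one-second interval, and the 60-second default timeout are illustrative assumptions. It relies only on the "status": "indexed" field shown above and uses a plain text match, so it stays within the Docker-and-curl toolset at the cost of not parsing the JSON properly.

```shell
# Hypothetical helpers; names, interval, and timeout are assumptions.

# Succeeds if the JSON on stdin contains "status": "indexed".
is_indexed() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"indexed"'
}

# Polls a dataset URL once per second until it reports indexed,
# or fails after the given number of seconds (default 60).
wait_until_indexed() {
  url=$1
  timeout=${2:-60}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # -f turns HTTP errors into a failed (empty) read, so they never match.
    if curl -fsS "$url" 2>/dev/null | is_indexed; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "timed out waiting for $url" >&2
  return 1
}
```

With these defined, wait_until_indexed http://localhost:8080/v1/datasets/products 120 returns success once the dataset is ready and a non-zero status after two minutes otherwise.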
Step 3
Query
Top-k nearest-neighbour search with an optional AND-of-equals metadata filter: a record matches only if every key in the filter equals the corresponding metadata value. The response translates the int64 FAISS hits back to your caller-supplied id and metadata via the sidecar.
curl -X POST http://localhost:8080/v1/query \
-H 'Content-Type: application/json' \
-d '{"dataset":"products","vector":[0.1,0.2,0.3,0.4],"top_k":2,"filter":{"category":"books"}}'
Response (HTTP 200):
{
  "matches": [
    { "id": "a", "score": 0.0, "metadata": {"category": "books"} }
  ]
}
Step 4
Where next
That's the whole API surface in three calls. From here:
- Architecture — what's running under that one host port, and why.
- Datasets — bulk import for uploads above 10 MiB, status lifecycle, the soft-delete model.
- Query — nprobe, ephemeral results, the filter contract.
- MCP server — point Claude / Cursor at your RosalindDB; agents call the REST API as MCP tools.
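The Datasets entry above mentions bulk import for uploads above 10 MiB. As a rough sketch, a script can check an NDJSON file's size before choosing between the inline ingest call from Step 2 and bulk import. The helper name fits_inline and the reading of the threshold as exactly 10 × 1024 × 1024 bytes, inclusive, are assumptions rather than documented behaviour; the Datasets page is the authority on the real limit.

```shell
# Hypothetical helper; the exact, inclusive 10 MiB cutoff is an assumption.
INLINE_LIMIT_BYTES=$((10 * 1024 * 1024))

# Succeeds if the file exists and is small enough for the inline ingest call.
fits_inline() {
  [ -f "$1" ] || return 1
  # Arithmetic expansion strips any padding that wc adds around the count.
  size=$(($(wc -c < "$1")))
  [ "$size" -le "$INLINE_LIMIT_BYTES" ]
}
```

If fits_inline vectors.ndjson succeeds (the filename is a placeholder), the Step 2 ingest call can read the file with curl's --data-binary @vectors.ndjson in place of the inline string.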