RosalindDB

An object-storage-first vector database for cold and bursty workloads. Apache 2.0.

© 2026 RosalindDB contributors. Apache License 2.0.

    Run RosalindDB in five minutes

    One docker compose up stands up the whole stack — Control Plane, Query Data Plane, three async workers, Postgres, Redis, and a MinIO bucket — on a single Docker network. No signup, no API key.

    Quickstart

    Four steps. Examples assume the default OSS run mode — RB_REQUIRE_AUTH off, single implicit default tenant, no auth headers. If you've flipped auth on, see Authentication for the bearer-token shape.

    Step 1

    Run the stack

    You need Docker and curl. Nothing else.

    git clone https://github.com/rosalinddb/rosalinddb.git
    cd rosalinddb
    docker compose up

    The Control Plane comes up on http://localhost:8080 as the single public origin. MinIO, Postgres, Redis, and the validator / builder / ephemeral workers all run privately on the compose network. First boot pulls images and runs migrations; subsequent runs start in a few seconds.
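
Because the first boot spends time pulling images and running migrations, a script that drives the next steps may want to wait for the port before sending requests. The sketch below only checks that a TCP connection to the Control Plane port succeeds; it is a generic readiness probe, not a RosalindDB health check, and `wait_for_port` is a hypothetical helper name. An open port does not by itself prove that migrations have finished.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 120.0,
                  interval_s: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it probes the socket only and knows nothing about
    RosalindDB's own readiness signals.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError (e.g. connection refused)
            # while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

Calling `wait_for_port("localhost", 8080)` after `docker compose up` blocks until the quickstart's port accepts connections or two minutes pass.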

    Heads up: the OSS default has no auth. Do not expose :8080 to the public internet without flipping RB_REQUIRE_AUTH=true first. The CP logs a loud warning at startup if it detects a likely-public bind.

    Step 2

    Create a dataset and ingest

    A dataset is a named, tenant-scoped collection of vectors with a fixed embedding dimension. The dimension is set at create time and is immutable thereafter.

    curl -X POST http://localhost:8080/v1/datasets \
      -H 'Content-Type: application/json' \
      -d '{"name": "products", "dimension": 4}'
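
For callers scripting this step, the same request can be assembled with Python's standard library. This is a minimal sketch, assuming the default no-auth mode described above; `build_create_dataset_request` is a hypothetical helper, and the positive-dimension check is a client-side sanity check rather than a documented server rule. The request is built but not sent, so the snippet runs without a live stack.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # the quickstart's default origin


def build_create_dataset_request(name: str, dimension: int,
                                 base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/datasets request shown above."""
    if dimension <= 0:
        # Assumption: a non-positive dimension can never be valid.
        raise ValueError("dimension must be a positive integer")
    body = json.dumps({"name": name, "dimension": dimension}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/datasets",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_create_dataset_request("products", 4)

# To send it against a running stack:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```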

    Ingest a couple of vectors as NDJSON — one record per line, each with an id, a values array matching the dataset's dimension, and an optional metadata object:

    curl -X POST http://localhost:8080/v1/datasets/products/vectors \
      -H 'Content-Type: application/x-ndjson' \
      --data-binary $'{"id":"a","values":[0.1,0.2,0.3,0.4],"metadata":{"category":"books"}}\n{"id":"b","values":[0.5,0.5,0.5,0.5],"metadata":{"category":"movies"}}\n'
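
Hand-writing NDJSON inside a shell string gets awkward beyond a few records. The sketch below serializes records to the one-record-per-line body described above and checks each `values` array against the dataset's dimension before anything is sent. `to_ndjson` is a hypothetical helper, and the checks are client-side conveniences, not a description of the server's validator.

```python
import json
from typing import Any, Dict, Iterable


def to_ndjson(records: Iterable[Dict[str, Any]], dimension: int) -> bytes:
    """Encode records as NDJSON: one compact JSON object per line."""
    lines = []
    for n, rec in enumerate(records, start=1):
        if "id" not in rec or "values" not in rec:
            raise ValueError(f"record {n}: 'id' and 'values' are required")
        if len(rec["values"]) != dimension:
            raise ValueError(
                f"record {n}: expected {dimension} values, got {len(rec['values'])}"
            )
        # Compact separators keep each record on a single line.
        lines.append(json.dumps(rec, separators=(",", ":")))
    # Each record, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines).encode("utf-8")


payload = to_ndjson(
    [
        {"id": "a", "values": [0.1, 0.2, 0.3, 0.4], "metadata": {"category": "books"}},
        {"id": "b", "values": [0.5, 0.5, 0.5, 0.5], "metadata": {"category": "movies"}},
    ],
    dimension=4,
)
```

The resulting `payload` is byte-for-byte the body sent by the curl command above and can be posted with `Content-Type: application/x-ndjson`.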

    Ingest returns immediately — the heavy work (validate, build the FAISS shard) runs on async workers behind a Redis reliable queue. Poll for readiness:

    curl http://localhost:8080/v1/datasets/products
    # wait until "status": "indexed"
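
The manual poll can be automated with a small loop. This sketch assumes the dataset response is a JSON object with a top-level `status` field that eventually reads `"indexed"`, as the comment above suggests; other status values and the rest of the response shape are not covered here. `wait_until_indexed` and `fetch_products` are hypothetical names, and the fetch function is passed in so the loop can be exercised without a running stack.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict


def wait_until_indexed(fetch_dataset: Callable[[], Dict[str, Any]],
                       timeout_s: float = 60.0,
                       interval_s: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fetch_dataset until it reports status 'indexed' or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        dataset = fetch_dataset()
        if dataset.get("status") == "indexed":
            return dataset
        if time.monotonic() >= deadline:
            raise TimeoutError(f"dataset not indexed after {timeout_s}s")
        sleep(interval_s)


def fetch_products() -> Dict[str, Any]:
    """GET the dataset document from the quickstart's default origin."""
    with urllib.request.urlopen("http://localhost:8080/v1/datasets/products") as resp:
        return json.load(resp)


# Against a running stack:
# dataset = wait_until_indexed(fetch_products)
```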

    Step 3

    Query

    Top-k nearest-neighbour search with an optional AND-of-equals metadata filter: a record matches only if every key in the filter equals the corresponding value in its metadata. The response maps the int64 FAISS hits back to your caller-supplied id and metadata via the sidecar.

    curl -X POST http://localhost:8080/v1/query \
      -H 'Content-Type: application/json' \
      -d '{"dataset":"products","vector":[0.1,0.2,0.3,0.4],"top_k":2,"filter":{"category":"books"}}'

    Response (HTTP 200):

    {
      "matches": [
        { "id": "a", "score": 0.0, "metadata": {"category": "books"} }
      ]
    }
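
To make the query semantics concrete, here is a brute-force reference in plain Python. It is an illustration under stated assumptions, not RosalindDB's implementation: the score is assumed to be squared L2 distance (which is consistent with the exact match scoring 0.0 above, but the metric is not stated on this page), the sidecar is modelled as a dictionary from integer row label to id and metadata, and the filter is applied while walking the ranked hits. All function names are hypothetical.

```python
from typing import Any, Dict, List, Optional, Sequence


def matches_filter(metadata: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """AND-of-equals: every filter key must be present with an equal value."""
    return all(k in metadata and metadata[k] == v for k, v in flt.items())


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed metric: squared Euclidean distance (smaller is closer)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_query(records: List[Dict[str, Any]], vector: Sequence[float],
                      top_k: int, flt: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    flt = flt or {}
    # Sidecar stand-in: integer row label -> (caller-supplied id, metadata).
    sidecar = {label: (rec["id"], rec.get("metadata", {}))
               for label, rec in enumerate(records)}
    # Rank every row by distance; a real index would return labels like these.
    hits = sorted((squared_l2(vector, rec["values"]), label)
                  for label, rec in enumerate(records))
    matches = []
    for score, label in hits:
        rec_id, metadata = sidecar[label]
        if matches_filter(metadata, flt):
            matches.append({"id": rec_id, "score": score, "metadata": metadata})
            if len(matches) == top_k:
                break
    return {"matches": matches}


records = [
    {"id": "a", "values": [0.1, 0.2, 0.3, 0.4], "metadata": {"category": "books"}},
    {"id": "b", "values": [0.5, 0.5, 0.5, 0.5], "metadata": {"category": "movies"}},
]
result = brute_force_query(records, [0.1, 0.2, 0.3, 0.4], top_k=2,
                           flt={"category": "books"})
```

With the two records ingested earlier, `result` contains a single match for id `a`: `top_k` is 2, but only one record satisfies the `category` filter.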

    Step 4

    Where next

    That's the whole API surface in three calls: create a dataset, ingest vectors, and query, with a status poll between ingest and query. From here:

    • Architecture — what's running under that one host port, and why.
    • Datasets — bulk import for uploads above 10 MiB, status lifecycle, the soft-delete model.
    • Query — nprobe, ephemeral results, the filter contract.
    • MCP server — point Claude / Cursor at your RosalindDB; agents call the REST API as MCP tools.