Architecture

RosalindDB is built around four structural decisions: object storage is the source of truth, the public surface is split from the search surface, ingest is asynchronous, and the builder is safe to scale horizontally. The rest is mostly plumbing.

Object storage is the source of truth

The authoritative copy of every shard lives in S3-compatible object storage. Query-DP keeps a per-process LRU of deserialised FAISS indexes — fast — but holds no authoritative state of its own. Kill a Query-DP replica, start a new one, and the worst case is a few hundred milliseconds of cold reads from S3 until the cache is warm again.

┌──────────────────────────────────────────────────────────────────┐
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒ object storage is the source of truth ▒▒▒▒▒▒▒▒▒▒▒▒▒│
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│   ╔══════════════╗            ╔════════════════════════╗         │
│   ║              ║▒           ║                        ║▒        │
│   ║   ingest     ║──writes───▶║     Object Storage     ║▒        │
│   ║   workers    ║▒           ║  landing/    indexes/  ║▒        │
│   ║              ║▒           ║                        ║▒        │
│   ╚══════════════╝▒           ╚═══════════╦════════════╝▒        │
│    ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒            ▒▒▒▒▒▒▒▒▒▒▒│▒▒▒▒▒▒▒▒▒▒▒▒▒▒        │
│                                            │ read on cache miss  │
│                                            ▼                     │
│                             ╔═════════════════════════╗          │
│                             ║                         ║▒         │
│                             ║        Query-DP         ║▒         │
│                             ║    in-process shard     ║▒         │
│                             ║   cache  (byte-budget)  ║▒         │
│                             ╚═════════════════════════╝▒         │
│                              ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒          │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

The cache is bounded by bytes, not entry count. A 1k-vector shard and a 1M-vector shard differ 1000× in vector count and by orders of magnitude in memory footprint; a count cap can't bound Query-DP memory, so we don't use one. The default budget is 512 MB (RB_SHARD_CACHE_BYTES), with least-recently-used eviction. The same property (no authoritative DP state) is what lets the query tier scale to zero between bursts.
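
As an illustration of byte-budget eviction, here is a minimal sketch in Python. It is not the Query-DP implementation: the class name, the method names, and the decision to skip caching a shard larger than the whole budget are assumptions made for the example.

```python
from collections import OrderedDict


class ByteBudgetLRU:
    """LRU cache bounded by total bytes rather than entry count (hypothetical)."""

    def __init__(self, budget_bytes: int) -> None:
        self.budget_bytes = budget_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # key -> (value, size_bytes), oldest first

    def get(self, key: str):
        entry = self._entries.get(key)
        if entry is None:
            return None  # cache miss: the caller reloads from object storage
        self._entries.move_to_end(key)  # mark as most recently used
        return entry[0]

    def put(self, key: str, value: object, size_bytes: int) -> None:
        if key in self._entries:
            # Replacing an entry: release its old size first.
            self.used_bytes -= self._entries.pop(key)[1]
        if size_bytes > self.budget_bytes:
            return  # assumption: an oversized shard is served uncached
        self._entries[key] = (value, size_bytes)
        self.used_bytes += size_bytes
        # Evict least recently used entries until we are back under budget.
        while self.used_bytes > self.budget_bytes:
            _, (_, evicted_size) = self._entries.popitem(last=False)
            self.used_bytes -= evicted_size

    def __contains__(self, key: str) -> bool:
        return key in self._entries
```

With a count cap, one large shard and one tiny shard would each cost "one slot". Here a single large `put` can evict many small entries, which is what keeps total memory pinned to the budget.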

Control plane / data plane

The Control Plane is the only public surface. It handles auth, dataset CRUD, and ingest admission, then proxies /v1/query to a private Data Plane. The DP does no auth of its own: it takes tenant identity from a single header, X-RB-Tenant-Id, that the CP attaches after verifying the caller.

┌──────────────────────────────────────────────────────────────────┐
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ control plane / data plane ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│                      ┌──────────────┐                            │
│                      │    client    │                            │
│                      └──────┬───────┘                            │
│                             │ HTTPS                              │
│                             ▼                                    │
│   ╔══════════════════════════════════════════════════════════╗   │
│   ║                                                          ║▒  │
│   ║                  Control Plane  (CP)                     ║▒  │
│   ║    auth · dataset CRUD · ingest admit · reverse proxy    ║▒  │
│   ║                                                          ║▒  │
│   ╚═════════════════════════╦════════════════════════════════╝▒  │
│    ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒   │
│                             │ X-RB-Tenant-Id   (verified)        │
│                             │ X-RB-Proxy-Secret  (defence)       │
│                             ▼                                    │
│                 ╔═══════════════════════════╗                    │
│                 ║                           ║▒                   │
│                 ║         Query-DP          ║▒                   │
│                 ║      FAISS · cache        ║▒                   │
│                 ║                           ║▒                   │
│                 ╚═══════════════════════════╝▒                   │
│                  ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒                    │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

The point of the split is twofold. Latency: the search code is never blocked behind a tenant lookup or a rate-limit dependency. Blast radius: the DP lives on a network the public internet can't reach. A shared-secret header (X-RB-Proxy-Secret) is the defence-in-depth layer on top, enforced when set on both ends.
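
The trust model on the DP side can be sketched as a small header check. This is an assumption-laden illustration, not the project's code: the function name, the use of `PermissionError`, and the plain `Mapping` of headers are hypothetical. Only the two header names come from the text above.

```python
import hmac
from typing import Mapping, Optional


def resolve_tenant(headers: Mapping[str, str], proxy_secret: Optional[str]) -> str:
    """Return the tenant id attached by the Control Plane, or raise PermissionError.

    proxy_secret is the Data Plane's configured shared secret. None (or an
    empty string) disables the check, mirroring "enforced when set".
    """
    if proxy_secret:
        presented = headers.get("X-RB-Proxy-Secret", "")
        # Constant-time comparison avoids leaking the secret through timing.
        if not hmac.compare_digest(presented.encode(), proxy_secret.encode()):
            raise PermissionError("missing or wrong X-RB-Proxy-Secret")
    tenant_id = headers.get("X-RB-Tenant-Id", "").strip()
    if not tenant_id:
        raise PermissionError("missing X-RB-Tenant-Id")
    return tenant_id
```

The check involves no catalog or token lookup, which is the latency argument in miniature: everything expensive happened in the CP before the request was proxied. A real server framework would normally hand over case-insensitive headers; the exact-case lookup here is a simplification.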

Async ingest pipeline

There is no synchronous CP→DP hop on ingest. The CP's job ends at LPUSH. Kill every worker right now and the API will keep accepting writes — status: validating stays pinned in the catalog until you bring workers back, but nothing is lost.

┌──────────────────────────────────────────────────────────────────┐
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ async ingest pipeline ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ╔══════════╗                                                    │
│  ║    CP    ║▒   parses NDJSON · writes landing part · LPUSH     │
│  ╚════╦═════╝▒                                                   │
│   ▒▒▒▒│▒▒▒▒▒▒▒                                                   │
│       │ VALIDATE_DATASET                                         │
│       ▼                                                          │
│  ╔══════════╗                                                    │
│  ║  Redis   ║▒   reliable queue  (LMOVE · processing · reaper)   │
│  ╚════╦═════╝▒                                                   │
│   ▒▒▒▒│▒▒▒▒▒▒▒                                                   │
│       │                                                          │
│       ▼                                                          │
│  ┌──────────────┐    parquet      ┌──────────────────────────┐   │
│  │  validator   │────────────────▶│    landing/.../part-*    │   │
│  └──────┬───────┘                 └──────────────────────────┘   │
│         │ DATASET_READY                                          │
│         ▼                                                        │
│  ┌──────────────┐    shard.bin    ┌──────────────────────────┐   │
│  │   builder    │────────────────▶│   indexes/.../shard.bin  │   │
│  │              │    catalog row  ┌──────────────────────────┐   │
│  │              │────────────────▶│   shard_catalog (PG)     │   │
│  └──────────────┘                 └──────────────────────────┘   │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

Redis is used as a reliable queue, not pub/sub. Publish is LPUSH; consume is an atomic LMOVE onto a per-topic processing list; ack is LREM. A worker that crashes mid-handler leaves the message on the processing list, where a periodic reaper recovers it after QUEUE_RECLAIM_TIMEOUT. Delivery is at-least-once; every handler is idempotent.
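
The claim, ack, and reap cycle is easier to follow in a small in-memory model. The sketch below is not Redis client code and not the project's worker loop. It mirrors the three list operations with a `deque` and a dict, and it assumes messages are unique strings, that claim times are tracked per message, and that reclaimed messages go back to the consume end of the list. None of those details are stated in the source.

```python
import time
from collections import deque


class ReliableQueueModel:
    """In-memory model of one topic: a pending list plus a processing list."""

    def __init__(self, reclaim_timeout: float, clock=time.monotonic) -> None:
        self.reclaim_timeout = reclaim_timeout  # stands in for QUEUE_RECLAIM_TIMEOUT
        self.clock = clock
        self.pending = deque()  # publish pushes left, consume pops right
        self.processing = {}    # message -> time it was claimed

    def publish(self, message: str) -> None:
        self.pending.appendleft(message)  # models LPUSH

    def consume(self):
        if not self.pending:
            return None
        message = self.pending.pop()             # models LMOVE: leave pending ...
        self.processing[message] = self.clock()  # ... and land on processing
        return message

    def ack(self, message: str) -> None:
        self.processing.pop(message, None)  # models LREM on the processing list

    def reap(self) -> int:
        """Requeue messages whose claim is at least reclaim_timeout old."""
        now = self.clock()
        stale = [m for m, claimed_at in self.processing.items()
                 if now - claimed_at >= self.reclaim_timeout]
        for message in stale:
            del self.processing[message]
            self.pending.append(message)  # assumption: redelivered next
        return len(stale)
```

A worker that calls `consume()` and dies never reaches `ack()`, so the message sits in `processing` until `reap()` moves it back. The same message can therefore reach a handler twice, which is why the handlers have to be idempotent.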

The builder lock

The index builder is safe to scale horizontally because it serialises per-dataset builds through a Postgres advisory lock. Two replicas can pull a DATASET_READY for the same dataset; both call pg_try_advisory_lock on a hash of (tenant, dataset); the loser NACKs back to the queue and runs once the holder finishes.

┌──────────────────────────────────────────────────────────────────┐
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ the builder lock ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│           ╔════════════════════════════════════╗                 │
│           ║      Redis · DATASET_READY         ║▒                │
│           ║      dataset = "products"          ║▒                │
│           ╚══════╦═══════════╦═══════════╦═════╝▒                │
│            ▒▒▒▒▒▒│▒▒▒▒▒▒▒▒▒▒▒│▒▒▒▒▒▒▒▒▒▒▒│▒▒▒▒▒▒▒                │
│                  │           │           │                       │
│                  ▼           ▼           ▼                       │
│             ┌────────┐  ┌────────┐  ┌────────┐                   │
│             │builder1│  │builder2│  │builder3│  (all 3 pulled)   │
│             └────┬───┘  └────┬───┘  └────┬───┘                   │
│                  │           │           │                       │
│                  └───────────┼───────────┘                       │
│                              │                                   │
│                 pg_try_advisory_lock(hash("products"))           │
│                              │                                   │
│                              ▼                                   │
│                      ╔═══════════════╗                           │
│                      ║    winner     ║▒  builds                  │
│                      ║   builder2    ║▒  losers NACK to queue    │
│                      ╚═══════════════╝▒                          │
│                       ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒                          │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
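
Postgres advisory locks are keyed by a bigint, so the (tenant, dataset) pair has to be reduced to a signed 64-bit integer. One way to do that is sketched below. The hash function, the length-prefixed encoding, and the function name are assumptions; the source only says the key is "a hash of (tenant, dataset)".

```python
import hashlib
import struct


def advisory_lock_key(tenant_id: str, dataset: str) -> int:
    """Map (tenant, dataset) to a signed 64-bit key (hypothetical derivation)."""
    # Length-prefix the tenant so ("ab", "c") and ("a", "bc") hash differently.
    material = f"{len(tenant_id)}:{tenant_id}:{dataset}".encode("utf-8")
    digest = hashlib.sha256(material).digest()
    # Keep the first 8 bytes and read them as a signed big-endian integer,
    # which fits the bigint argument of pg_try_advisory_lock.
    (key,) = struct.unpack(">q", digest[:8])
    return key
```

A builder would pass the result to `SELECT pg_try_advisory_lock(<key>)` and proceed only if the call returns true. The derivation has to be stable across processes and restarts, which is why this sketch uses a cryptographic digest rather than Python's built-in `hash()`, whose output for strings is randomised per process.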

The same advisory-lock pattern guards schema migrations on startup: only one process applies the next migration, and the rest see the post-migration schema. A lock taken with pg_try_advisory_lock belongs to the Postgres session that acquired it, which is why PgBouncer in transaction-pooling mode is a footgun: the server connection goes back to the pool between transactions, so the builder no longer owns the session that holds its lock, and two builders can end up racing on the same dataset. Use session pooling, or skip PgBouncer for the catalog connection.

Two env switches

The same image runs the OSS install and a multi-tenant self-host. Two env vars decide which mode — there is no separate build, no feature flag service, no enterprise edition.

Var	Off (OSS default)	On (multi-tenant self-host)
RB_REQUIRE_AUTH	`/auth/*` returns 404. Every request is attributed to a single default tenant. No JWT, no API keys.	Signup, JWT issuance, `rb_live_…` API keys, per-tenant isolation.
RB_ENABLE_QUOTAS	Rate limit is a no-op. `GET /auth/usage` returns `{ enabled: false }`.	Token-bucket rate limiter, daily query cap, ingest admission quota.

Neither switch is a build flag: set the variable, restart the CP, and the new mode takes effect. The CP logs a loud warning at startup if it detects a likely-public bind with RB_REQUIRE_AUTH off.
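
A rough sketch of how the two switches might be read at startup follows. Only the variable names come from the table above. The accepted truthy spellings, the `Mode` type, and the "anything that is not loopback is likely public" heuristic are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping

_TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings, not from the source
_LOOPBACK = {"127.0.0.1", "::1", "localhost"}


@dataclass(frozen=True)
class Mode:
    require_auth: bool
    enable_quotas: bool


def load_mode(env: Mapping[str, str] = os.environ) -> Mode:
    """Read both switches; anything unset or unrecognised counts as off."""
    def flag(name: str) -> bool:
        return env.get(name, "").strip().lower() in _TRUTHY
    return Mode(flag("RB_REQUIRE_AUTH"), flag("RB_ENABLE_QUOTAS"))


def startup_warnings(mode: Mode, bind_host: str) -> List[str]:
    """Return warnings to log loudly before the server starts listening."""
    if not mode.require_auth and bind_host not in _LOOPBACK:
        return [f"RB_REQUIRE_AUTH is off but the CP binds to {bind_host}"]
    return []
```

Because the mode is computed once from the environment, switching it is a restart rather than a rebuild, which matches the single-image design described above.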

Read further

This page is the shape. Every queue topic, every advisory-lock key, the full configuration surface, observability wiring, and the production-hardening checklist live in the backend repo:

docs/architecture/architecture.md — the engineering source of truth.
docs/indexing.md — incremental-append builder behaviour.
docs/deploy/self-host.md — the self-host runbook (production hardening, env vars, gotchas).