Tags: RAG, FastAPI, Python, Architecture, Multi-tenancy, Docker

Building MiniRAG: A Modular RAG Platform - Progress Report #1

Oli


I'm currently neck-deep in building MiniRAG - a modular, provider-agnostic Retrieval-Augmented Generation (RAG) platform designed with multi-tenancy from the ground up. Two weeks in, and I've hit some interesting challenges that I think other developers might find valuable. Here's where we stand and what I've learned so far.

The Vision

MiniRAG aims to solve a common problem: most RAG implementations are tightly coupled to specific providers (OpenAI, Anthropic, etc.) and lack proper multi-tenant architecture. I wanted something that could:

  • Support multiple LLM providers seamlessly
  • Handle multiple tenants with proper data isolation
  • Scale horizontally with background workers
  • Provide a clean API for integration

What's Built So Far

Step 1: Foundation Architecture ✅

The foundation is solid. I went with a modern Python stack:

app/
├── api/v1/          # FastAPI routes
├── core/            # Config, database, security
├── models/          # SQLModel definitions
├── services/        # Business logic
└── workers/         # Background tasks

Key architectural decisions:

  • FastAPI for async API performance
  • SQLModel (SQLAlchemy 2.0) for modern async ORM
  • Qdrant for vector storage
  • Redis for caching and task queues
  • PostgreSQL for relational data

The Docker setup includes full health checks across all services:

# docker-compose.yml snippet
services:
  postgres:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-postgres}"]
      interval: 10s
      timeout: 5s
      retries: 5

  qdrant:
    image: qdrant/qdrant:v1.13.0
    healthcheck:
      test: ["CMD-SHELL", "bash -c ':> /dev/tcp/localhost/6333' || exit 1"]

Step 2: Authentication & Multi-tenancy ✅

This was the trickier part. I built a robust auth system supporting both API tokens and JWT:

# Core auth context
@dataclass
class AuthContext:
    tenant: Tenant
    user: User | None = None
    api_token: ApiToken | None = None
    
    @property
    def is_tenant_owner(self) -> bool:
        return self.user is not None and self.user.role == UserRole.OWNER

The auth flow supports:

  • Bootstrap endpoint: POST /v1/tenants creates tenant + owner + initial API token
  • API token management: Create, list, and revoke tokens (all tenant-scoped)
  • Automatic tenant isolation: Every query auto-filters by tenant_id
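The isolation guarantee is easiest to see with a toy stand-in. This in-memory repository is illustrative only (the real version applies the same filter in SQL), but it shows the key property: the `tenant_id` filter lives inside the data-access layer, so callers can't forget it.

```python
# Illustrative in-memory stand-in for tenant-scoped queries. Names here
# are hypothetical, not MiniRAG's actual service layer.
from uuid import UUID, uuid4


class SourceRepo:
    def __init__(self) -> None:
        self._rows: list[dict] = []

    def add(self, tenant_id: UUID, name: str) -> dict:
        row = {"id": uuid4(), "tenant_id": tenant_id, "name": name}
        self._rows.append(row)
        return row

    def list_for(self, tenant_id: UUID) -> list[dict]:
        # The tenant filter is applied here, not left to the caller.
        return [r for r in self._rows if r["tenant_id"] == tenant_id]


repo = SourceRepo()
tenant_a, tenant_b = uuid4(), uuid4()
repo.add(tenant_a, "docs-a")
repo.add(tenant_b, "docs-b")
print([r["name"] for r in repo.list_for(tenant_a)])
```

In the real system the same pattern becomes a `WHERE tenant_id = :tenant_id` clause baked into every query the service layer issues.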

Lessons Learned (The Hard Way)

1. Hatch Build Configuration Gotcha

The Problem: Tried to install the package with pip install -e ".[dev]" and got:

ValueError: Unable to determine which files to ship inside the wheel

The Fix: When your project name differs from your package directory, Hatch needs explicit configuration:

# pyproject.toml
[tool.hatch.build.targets.wheel]
packages = ["app"]

This one cost me an hour of debugging. Always configure your build targets explicitly!

2. FastAPI HTTPBearer Behavior Change

The Problem: Expected 403 Forbidden for missing Authorization header, got 401 Unauthorized.

The Reality: Modern FastAPI's HTTPBearer returns 401 for missing auth headers, not 403. The distinction matters for API contract testing:

# Robust test assertion
assert resp.status_code in (401, 403)  # Handle both cases

3. Async SQLAlchemy Dependencies

The Problem:

ValueError: the greenlet library is required

The Fix: When using async SQLAlchemy with aiosqlite, you need both:

# pyproject.toml
dependencies = [
    "greenlet>=3",
    "aiosqlite>=0.20",
    # ... other deps
]

This isn't always obvious from the documentation, but greenlet is essential for async context switching in SQLAlchemy.

Current Status: Ready for Step 3

All tests are passing, and the foundation is solid:

$ pytest tests/test_auth_flow.py -v
tests/test_auth_flow.py::test_bootstrap_tenant PASSED
tests/test_auth_flow.py::test_duplicate_tenant_slug PASSED  
tests/test_auth_flow.py::test_invalid_api_token PASSED
tests/test_auth_flow.py::test_missing_authorization PASSED

The auth system handles:

  • Tenant bootstrapping with duplicate slug protection
  • Secure API token storage (SHA-256 hashed, never stored in plaintext)
  • Proper 401 responses for invalid/missing tokens
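The token-storage scheme can be sketched in a few stdlib lines (helper names here are hypothetical; the real service layer differs). The point is that the plaintext is returned exactly once at creation time, and only the SHA-256 digest ever reaches the database:

```python
# Sketch of hashed API-token storage: store the digest, never the plaintext.
import hashlib
import hmac
import secrets


def new_api_token() -> tuple[str, str]:
    """Return (plaintext shown to the user once, sha256 hex digest to store)."""
    raw = secrets.token_urlsafe(32)
    return raw, hashlib.sha256(raw.encode()).hexdigest()


def verify_api_token(presented: str, stored_hash: str) -> bool:
    """Hash the presented token and compare in constant time."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

Looking up a presented token then means hashing it and querying by digest, so a leaked database dump never exposes usable credentials.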

What's Next: Bot Profiles & Sources

Step 3 focuses on the core RAG functionality:

  1. BotProfile model: Store LLM provider configurations with encrypted credentials
  2. Source model: Manage document sources with proper tenant isolation
  3. CRUD endpoints: Full REST APIs for both entities
  4. Integration tests: Ensure multi-tenant data isolation works correctly

The interesting challenge here will be the encrypted credentials field for bot profiles. I'm planning to use Fernet encryption with tenant-specific keys, so that even someone with raw database access can't decrypt one tenant's credentials using another tenant's key.
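One possible shape of that plan, using `cryptography`'s Fernet with a deliberately simple per-tenant key derivation (the real implementation may well use a proper KDF like HKDF instead — treat this as a sketch, not the design):

```python
# Sketch: per-tenant Fernet keys derived from a master secret, so a
# ciphertext from one tenant cannot be decrypted with another tenant's key.
# The derivation below (sha256 of master || tenant_id) is illustrative only.
import base64
import hashlib

from cryptography.fernet import Fernet


def tenant_key(master_secret: bytes, tenant_id: str) -> bytes:
    # Fernet wants 32 url-safe base64-encoded bytes; sha256 gives us 32 bytes.
    digest = hashlib.sha256(master_secret + tenant_id.encode()).digest()
    return base64.urlsafe_b64encode(digest)


def encrypt_credentials(master: bytes, tenant_id: str, plaintext: str) -> bytes:
    return Fernet(tenant_key(master, tenant_id)).encrypt(plaintext.encode())


def decrypt_credentials(master: bytes, tenant_id: str, token: bytes) -> str:
    return Fernet(tenant_key(master, tenant_id)).decrypt(token).decode()
```

Decrypting with the wrong tenant's key raises `InvalidToken`, which is exactly the failure mode you want: cross-tenant reads fail loudly rather than silently returning garbage.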

Technical Decisions Worth Noting

Why SQLModel Over Pure SQLAlchemy?

SQLModel provides the best of both worlds - Pydantic validation with SQLAlchemy power. The schema definitions are clean:

class TenantCreate(SQLModel):
    name: str = Field(min_length=1, max_length=255)
    slug: str = Field(min_length=1, max_length=100, pattern=r"^[a-z0-9-]+$")

class Tenant(TenantCreate, table=True):
    id: UUID = Field(default_factory=new_uuid, primary_key=True)
    created_at: datetime = Field(default_factory=utcnow)

Why Provider-Agnostic from Day One?

Most RAG implementations start with one provider and bolt on others later. I've seen this pattern fail too many times. Starting with abstraction from the beginning means cleaner interfaces and easier testing.
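A structural `Protocol` is one way to get that abstraction in Python (the names here are hypothetical, not MiniRAG's actual interface). Anything with the right method shape satisfies it, which is what makes providers swappable and testing cheap:

```python
# Illustrative provider-agnostic interface. Real providers (OpenAI,
# Anthropic, ...) would each get an adapter with this shape.
from typing import Protocol


class ChatProvider(Protocol):
    def complete(self, prompt: str, *, model: str) -> str: ...


class EchoProvider:
    """Trivial test double — useful precisely because the interface is abstract."""

    def complete(self, prompt: str, *, model: str) -> str:
        return f"[{model}] {prompt}"


def answer(provider: ChatProvider, question: str) -> str:
    # Application code depends only on the Protocol, never on a vendor SDK.
    return provider.complete(question, model="test-model")


print(answer(EchoProvider(), "hello"))
```

Because `Protocol` uses structural typing, no provider adapter needs to inherit from anything: integration tests can run against `EchoProvider` while production wires in a real vendor client.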

Wrapping Up

Building MiniRAG has been a great exercise in modern Python architecture. The async-first approach with proper multi-tenancy is paying dividends already, and I haven't even hit the complex RAG logic yet.

The key takeaway? Start with the hard parts first. Auth, multi-tenancy, and proper data isolation are much easier to build from the ground up than to retrofit later.

Stay tuned for the next progress report where I'll dive into the RAG pipeline architecture and vector storage patterns.


Want to follow along with the build? I'll be sharing more technical deep-dives and lessons learned as MiniRAG evolves. Hit me up if you've tackled similar multi-tenant RAG challenges - I'd love to hear about your approach!