# AI Sidecar Split
The AI module chain (`graph`, `aimodels`, `rag`, `agents`) can optionally run as a standalone sidecar service (`cmd/ai-service/`), separate from the monolith. The mode is controlled by the `AI_SERVICE_URL` env var on the monolith:
- Empty (default) — all modules run in-process in the monolith. No change from baseline.
- Set (e.g., `http://orkestra-ai:3100`) — the monolith skips registering `graph`/`aimodels`/`rag`/`agents` and instead registers `RemoteAIModelProvider` + `RemoteRAGQueryProvider` (HTTP clients) under the same `ServiceRegistry` keys. Consumer modules like `sales` use the same `GetTyped` pattern — zero code changes.
## How the split works
### Design constraints
- Both binaries live in the same Go module (`backend/`) — no code duplication, shared `internal/` packages.
- The AI service uses `JWTValidator` (public key only) instead of `AuthMiddleware` (which depends on the auth module). Both satisfy `module.RoleMiddleware`.
- Internal API endpoints (`/v1/internal/*`) are the service-to-service contract. They serialize the `iface.AIModelProvider` and `iface.RAGQueryProvider` method calls as HTTP request/response.
- Streaming (`/v1/rag/query/stream`) goes directly from frontend → AI service, never proxied through the monolith.
- The feature flag is fully backward compatible and K8s-ready (service DNS, Ingress routing, separate `Deployment` with independent scaling).
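Serializing a provider method call over the internal API amounts to a JSON request/response pair per method. A minimal, self-contained sketch of the client side, exercised against an in-memory test server standing in for the AI service (the `/v1/internal/complete` path, the `Complete` method, and all field names are assumptions, not the real wire contract):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Hypothetical wire format for one AIModelProvider method call.
type completeRequest struct {
	Prompt string `json:"prompt"`
}
type completeResponse struct {
	Text string `json:"text"`
}

// RemoteAIModelProvider forwards Complete calls to the AI service over HTTP.
type RemoteAIModelProvider struct {
	baseURL string
	client  *http.Client
}

func (r *RemoteAIModelProvider) Complete(prompt string) (string, error) {
	body, _ := json.Marshal(completeRequest{Prompt: prompt})
	resp, err := r.client.Post(r.baseURL+"/v1/internal/complete",
		"application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	var out completeResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Text, nil
}

// demo plays the AI-service side with an httptest server: decode the request,
// produce a result, encode the response. Returns what the client sees.
func demo() string {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		var in completeRequest
		json.NewDecoder(req.Body).Decode(&in)
		json.NewEncoder(w).Encode(completeResponse{Text: "echo: " + in.Prompt})
	}))
	defer srv.Close()

	p := &RemoteAIModelProvider{baseURL: srv.URL, client: srv.Client()}
	text, _ := p.Complete("hello")
	return text
}

func main() { fmt.Println(demo()) }
```

Each interface method maps to one such endpoint, which is why the remote providers can be registered under the same `ServiceRegistry` keys as the in-process modules.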
## Running split mode in dev
```sh
cd docker
# Shared infrastructure services
docker compose -f docker-compose.infra.yml up -d
# Monolith with AI_SERVICE_URL set → split mode
AI_SERVICE_URL=http://orkestra-ai-dev:3100 \
  docker compose -f docker-compose.dev.yml --env-file .env up -d
# The AI sidecar itself
docker compose -f docker-compose.ai-sidecar.yml --env-file .env up -d
```
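Because the flag is backward compatible, reverting to baseline is just restarting the monolith without the variable. A sketch using the same compose files (assumes the dev stack is running as above):

```sh
# Stop the sidecar, then restart the monolith with AI_SERVICE_URL unset —
# the AI modules run in-process again.
docker compose -f docker-compose.ai-sidecar.yml down
docker compose -f docker-compose.dev.yml --env-file .env up -d
```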