Roadmap
Sharpened scope: ship a small working core first, because a working core activates contributors before breadth does.
| Phase | Content | Rationale |
|---|---|---|
| v0.1: Core | Portable llama-server + Local Chat + browser-only UI + preflight-check (FS/speed/RAM/GPU) + minimal-trace smoke test | A small working core activates contributors |
| v0.2: Council | Local role council + trace UI + demo prompts + bias guard | The wow moment, runnable on weak hardware |
| v0.3: Online | OpenRouter escalation + Privacy-Diff + cost cap + BYO keys (vault) | Premium depth when a network is available, with privacy handling kept consistent |
| v0.4: Evals | evals/ Council-vs-Single + local multi-model Lab tier | Proof of the core claim |
| v0.5: Multimodal | Voice (whisper.cpp) + Vision (VLM) + Phone Access (TLS) | Only after the Council wow is proven |
| v0.6: Local Distribution | Optional Wi-Fi “hub mode”: a powered stick (e.g. on a power bank) serves signed model-packs, tools, and a local council API to paired nearby devices. A reverse proxy fronts one /v1 endpoint, gated by a QR-paired bearer token | Field teams without per-device installs; internet-off LAN, not air-gapped |
| v1.0 | Signed releases, update/rollback, landing page live, hardware matrix/CI | A dependable spec |
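
To make the v0.1 preflight check concrete, here is a minimal sketch using only the Python standard library. It is an illustration under assumptions, not the project's implementation: the function name `preflight`, the report keys, the probe size, and the reading of "FS/speed/RAM/GPU" as free space, write throughput, physical RAM, and presence of a vendor GPU tool are all hypothetical.

```python
import os
import shutil
import tempfile
import time


def preflight(path, probe_bytes=8 * 1024 * 1024):
    """Collect rough filesystem, write-speed, RAM, and GPU hints for `path`.

    Hypothetical helper: names and report keys are illustrative only.
    """
    report = {}

    # Filesystem: free space on the volume that would hold the models.
    report["free_bytes"] = shutil.disk_usage(path).free

    # Speed: time one synced write of a small probe file (coarse estimate).
    chunk = os.urandom(probe_bytes)
    with tempfile.NamedTemporaryFile(dir=path) as probe:
        start = time.perf_counter()
        probe.write(chunk)
        probe.flush()
        os.fsync(probe.fileno())
        elapsed = max(time.perf_counter() - start, 1e-9)
    report["write_mb_per_s"] = probe_bytes / elapsed / 1e6

    # RAM: os.sysconf exists only on POSIX; report None elsewhere.
    try:
        report["ram_bytes"] = (
            os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
        )
    except (AttributeError, ValueError, OSError):
        report["ram_bytes"] = None

    # GPU: only checks that a vendor CLI is on PATH, not that a GPU works.
    report["gpu_tools"] = [
        tool for tool in ("nvidia-smi", "rocm-smi") if shutil.which(tool)
    ]
    return report


if __name__ == "__main__":
    for key, value in preflight(tempfile.gettempdir()).items():
        print(f"{key}: {value}")
```

A real check would probably also look at the filesystem type and compare the numbers against the requirements of the chosen model; those thresholds are not specified in this roadmap, so the sketch only gathers the raw values.
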
VLM note: multimodal support in llama.cpp is version-fragile. “Just load a VLM GGUF” underestimates the work, which is why this phase is scheduled deliberately late.
Not on this roadmap: browser-side WebGPU inference and a self-hosting board are scoped but not yet chosen. They live in explorations, not here, until their open questions are resolved.
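
The bearer-token gate in the v0.6 row can be illustrated with a short sketch. It assumes the hub generates a random token at pairing time, shows it to the device as a QR code, and later checks the `Authorization` header of each request to the `/v1` endpoint. The function names `issue_pairing_token` and `is_authorized`, the token length, and the plain-dict header lookup are hypothetical; the roadmap does not specify any of them.

```python
import hmac
import secrets


def issue_pairing_token():
    """Create a random token that the hub could embed in a pairing QR code."""
    return secrets.token_urlsafe(32)


def is_authorized(headers, paired_tokens):
    """Return True if the Authorization header carries a paired bearer token.

    `headers` is assumed to be a plain dict keyed by "Authorization";
    a real proxy would normalise header names first.
    """
    value = headers.get("Authorization", "")
    scheme, _, presented = value.partition(" ")
    if scheme.lower() != "bearer" or not presented.strip():
        return False
    presented_bytes = presented.strip().encode()
    # Each individual comparison runs in constant time to avoid timing leaks.
    return any(
        hmac.compare_digest(presented_bytes, token.encode())
        for token in paired_tokens
    )


if __name__ == "__main__":
    token = issue_pairing_token()
    paired = {token}
    print(is_authorized({"Authorization": f"Bearer {token}"}, paired))
    print(is_authorized({"Authorization": "Bearer not-paired"}, paired))
```

Keeping the check in one small function makes it easy to call from whichever reverse proxy or middleware ends up fronting the endpoint. Token expiry and revocation are left out of the sketch.
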