Roadmap

Sharpened scope. A small working core activates contributors before breadth.

Phase	Content	Rationale
v0.1: Core	Portable `llama-server` + Local Chat + browser-only UI + `preflight-check` (FS/speed/RAM/GPU) + minimal-trace smoke test	A small working core activates contributors
v0.2: Council	Local role council + trace UI + demo prompts + bias guard	The wow moment, runnable on weak hardware
v0.3: Online	OpenRouter escalation + Privacy-Diff + cost cap + BYO keys (vault)	Premium depth on network, privacy consistent
v0.4: Evals	`evals/` Council-vs-Single + local multi-model Lab tier	Proof of the core claim
v0.5: Multimodal	Voice (whisper.cpp) + Vision (VLM) + Phone Access (TLS)	Only after the Council wow is proven
v0.6: Local Distribution	Optional Wi-Fi “hub mode”: a powered stick (e.g. on a power bank) serves signed model-packs, tools, and a local council API to paired nearby devices. A reverse proxy fronts one `/v1` endpoint, gated by a QR-paired bearer token	Field teams without per-device installs; internet-off LAN, not air-gapped
v1.0	Signed releases, update/rollback, landing page live, hardware matrix/CI	A dependable spec

VLM note: multimodal support in llama.cpp is version-fragile. “Just load a VLM GGUF” underestimates the work; hence deliberately late.

Not on this roadmap: browser-side WebGPU inference and a self-hosting board are scoped but unchosen. They live in explorations, not here, until an open question settles them.