Relay connects to remote MCP servers and runs the orchestration loop itself — the model calls tools, reads the results, and decides what to do next, streamed step by step.
Streamable HTTP · multi-server · provider-agnostic
Most demos hand this loop to a desktop client. Relay implements it directly: decide → call tool → feed the result back → decide again. Iterations are capped, the model can make multiple tool calls per turn, and tool errors are recovered without crashing the turn.
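The loop above can be sketched roughly as follows. This is a minimal illustration, not Relay's actual code: the `Model` and `ToolRunner` signatures, the message shapes, and the default cap of 8 iterations are all assumptions made for the example.

```typescript
// Hypothetical types for illustration; none of these names come from Relay.
type ToolCall = { id: string; name: string; args: Record<string, unknown> };
type ModelReply = { text: string; toolCalls: ToolCall[] };
type Message =
  | { role: "user" | "assistant"; content: string }
  | { role: "tool"; callId: string; content: string; isError: boolean };

type Model = (history: Message[]) => Promise<ModelReply>;
type ToolRunner = (call: ToolCall) => Promise<string>;

async function runTurn(
  userText: string,
  model: Model,
  runTool: ToolRunner,
  maxIterations = 8, // assumed cap; the real limit is not stated
): Promise<{ answer: string; history: Message[] }> {
  const history: Message[] = [{ role: "user", content: userText }];

  for (let i = 0; i < maxIterations; i++) {
    // Decide: the model sees everything so far, including tool results.
    const reply = await model(history);
    history.push({ role: "assistant", content: reply.text });

    // No tool calls means the model has chosen to answer.
    if (reply.toolCalls.length === 0) {
      return { answer: reply.text, history };
    }

    // A single model turn may request several tools.
    for (const call of reply.toolCalls) {
      try {
        const result = await runTool(call);
        history.push({ role: "tool", callId: call.id, content: result, isError: false });
      } catch (err) {
        // A failed tool becomes a result the model can read, so the turn continues.
        const message = err instanceof Error ? err.message : String(err);
        history.push({ role: "tool", callId: call.id, content: message, isError: true });
      }
    }
  }

  // The cap stops a model that keeps requesting tools forever.
  return { answer: "Stopped: iteration cap reached.", history };
}
```

Recording tool failures as ordinary tool messages is one way to get the "recovered without crashing" behavior: the model sees the error text on its next decision and can retry, pick another tool, or explain the failure.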
Every thinking step, tool call, and result streams to the UI as it happens, so you watch the work instead of a spinner.
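One way to model that stream is a small set of typed events, each serialized as it occurs. The sketch below assumes Server-Sent Events framing between server and browser; the source does not say which transport Relay uses for the UI, and the event names and fields are hypothetical.

```typescript
// Hypothetical event shapes; Relay's real wire format is not documented here.
type StepEvent =
  | { type: "thinking"; text: string }
  | { type: "tool_call"; tool: string; args: Record<string, unknown> }
  | { type: "tool_result"; tool: string; output: string; isError: boolean }
  | { type: "answer"; text: string };

// Serialize one event as a Server-Sent Events frame (assumed transport).
// JSON.stringify escapes newlines, so the payload stays on one data line.
function toSseFrame(event: StepEvent): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}

// Turn an event into the line of text a UI could append immediately.
function renderStep(event: StepEvent): string {
  switch (event.type) {
    case "thinking":
      return `Thinking: ${event.text}`;
    case "tool_call":
      return `Calling ${event.tool}`;
    case "tool_result":
      return event.isError
        ? `${event.tool} failed: ${event.output}`
        : `${event.tool} returned: ${event.output}`;
    case "answer":
      return event.text;
  }
}
```

A discriminated union keeps the client simple: it switches on `type` and appends one rendered line per frame as it arrives.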
Tools are aggregated across servers; names are namespaced and routed so two servers can both expose a tool named search.
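A minimal sketch of namespacing and routing, assuming each server has a unique id and that qualified names take the form `serverId__toolName`. The separator, the type names, and both functions are illustrative choices, not Relay's documented scheme.

```typescript
// Hypothetical shapes for illustration.
type ToolDef = { name: string; description: string };
type ServerTools = { serverId: string; tools: ToolDef[] };
type Route = { serverId: string; tool: string };

const SEPARATOR = "__"; // assumed separator

// Build a lookup from the qualified name the model sees to the server that owns the tool.
function aggregateTools(servers: ServerTools[]): Map<string, Route> {
  const routes = new Map<string, Route>();
  for (const { serverId, tools } of servers) {
    for (const tool of tools) {
      routes.set(`${serverId}${SEPARATOR}${tool.name}`, { serverId, tool: tool.name });
    }
  }
  return routes;
}

// Resolve a qualified name back to its server and original tool name.
function routeToolCall(routes: Map<string, Route>, qualifiedName: string): Route {
  const target = routes.get(qualifiedName);
  if (!target) {
    throw new Error(`Unknown tool: ${qualifiedName}`);
  }
  return target;
}
```

Routing through a lookup table, rather than splitting the qualified name on the separator, keeps the scheme correct even when a server id or tool name happens to contain the separator.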
SSRF-guarded URLs, bearer-token auth, and a provider-agnostic adapter — Azure, OpenAI, Groq, Gemini.
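As a rough illustration of what an SSRF guard on user-supplied server URLs can look like, the sketch below rejects non-HTTPS schemes, localhost names, and IP literals in loopback, private, and link-local ranges. The function names and the exact rule set are assumptions, not Relay's implementation. A production guard would also need to resolve hostnames and check the resolved addresses, which this example does not do.

```typescript
// Hypothetical helper: true for IPv4 literals in ranges a server should not reach.
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 || // "this network"
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 169 && b === 254) || // link-local, including cloud metadata endpoints
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}

// Hypothetical guard for a user-supplied MCP server URL.
function isAllowedServerUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false; // assumption: HTTPS only

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  if (host.startsWith("[")) return false; // conservative: reject all IPv6 literals
  return !isPrivateIPv4(host);
}
```

For example, `isAllowedServerUrl("https://169.254.169.254/latest")` returns `false`, while a public hostname over HTTPS passes. The literal checks only cover the URL as written, so a hostname that resolves to a private address would still need the DNS-level check mentioned above.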
Sign in with Google or email; conversations are saved per user in Firestore with full context memory.
The browser stays thin. Connections and model calls run server-side, and Relay reconnects to each server on every turn.
Add remote MCP URLs (optionally with a token). Tools are discovered over Streamable HTTP.
Your message plus the aggregated tools go to the model; results feed back until it answers.
Each step renders live; the conversation persists to your account.