model context protocol · host

Watch an LLM actually
use your tools.

Relay connects to remote MCP servers and runs the orchestration loop itself — the model calls tools, reads the results, and decides what to do next, streamed step by step.

Launch Relay See how it works

Streamable HTTP · multi-server · provider-agnostic

orchestration stream

› thinking

tool_call github__search_issues { q: "streaming" }

tool_result → 3 open issues

tool_call github__get_issue { number: 58213 }

tool_result → "App Router streaming stalls behind proxy"

final → Top streaming issue is #58213 — proxy buffering.

The host orchestration loop

Most demos let a desktop client do this. Relay builds it: decide → call tool → feed the result back → decide again. Capped iterations, multiple tool calls per turn, and tool errors recovered without crashing the turn.

decide→ call tool→ feed result→ repeat✓ final answer

Streamed live

Every thinking step, tool call, and result streams to the UI as it happens — not a spinner.

Many servers

Aggregate tools across servers; names are namespaced and routed so two servers can both expose search.

Secure & pluggable

SSRF-guarded URLs, bearer-token auth, and a provider-agnostic adapter — Azure, OpenAI, Groq, Gemini.

Yours, persisted

How it works

The browser stays thin. Connections and model calls run server-side, reconnect-per-turn.

Connect servers

Add remote MCP URLs (optionally with a token). Tools are discovered over Streamable HTTP.

Run the loop

Your message plus the aggregated tools go to the model; results feed back until it answers.

Watch it stream

Each step renders live; the conversation persists to your account.

Launch Relay

Watch an LLM actuallyuse your tools.