MCP · MULTI-AI · SEMANTIC CLUSTERING
MCP server generator built for ACM UTD — multi-AI architecture with semantic tool clustering to cut context cost.
- ORG: ACM @ UTD · team of 4
- ROLE: LLM-to-tools integration, app-agent layer
- STAGE: LIVE at helios-public.vercel.app
- STACK: Next.js 16 · Express · Mongo · Anthropic
You hand Helios an OpenAPI spec and a sentence describing what you actually want to build. Helios reads both, decides which endpoints belong together, names them as semantic tools, lets you chat with a sandboxed version of the resulting server before you commit — then ships a ready-to-deploy TypeScript MCP server as a ZIP. A compiler from “this API and what I want to do with it” to “an MCP server an agent can use without choking on its own tool list.”

MCP is infrastructure now: donated to the Linux Foundation, 9,400+ registered servers, ~33M weekly SDK downloads. But it has a known failure mode the community calls the MCP Tax: a naive server registers every endpoint as its own tool, at 200–500 tokens per schema. Five naive servers can dump 15,000–60,000 tokens of overhead into context before the user’s first message, and when twelve tools have near-identical names, models make measurably worse selection decisions.
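The overhead range above can be reproduced with simple multiplication. The sketch below assumes 15 to 24 tools per server, a figure back-calculated from the stated totals rather than one given in the text; `schemaOverhead` is a hypothetical helper name.

```typescript
// Rough context overhead for naive one-tool-per-endpoint servers.
// ASSUMPTION: 15 and 24 tools per server are back-calculated from the
// 15,000–60,000 range; the text does not state a per-server tool count.
function schemaOverhead(
  servers: number,
  toolsPerServer: number,
  tokensPerSchema: number,
): number {
  return servers * toolsPerServer * tokensPerSchema;
}

const low = schemaOverhead(5, 15, 200);  // small servers, compact schemas
const high = schemaOverhead(5, 24, 500); // larger servers, verbose schemas

console.log(low, high); // 15000 60000
```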
Existing generators (Speakeasy, Stainless, FastMCP, AWS Labs) are all structural — one tool per endpoint, maybe grouped by tag. None of them run an LLM over user intent to decide clustering and naming. That’s the gap.
| AI | JOB | WHEN IT RUNS |
|---|---|---|
| Generation AI | Scores endpoint relevance against user intent, clusters endpoints into semantic tools, names and describes each cluster | Once, after spec upload + intent |
| Sandbox AI | Reads the generated catalog, takes chat messages, picks tools, calls the live API (GET-only), returns inspectable results | Continuously, during preview |
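One way to picture the Generation AI's output is as a list of scored, labeled endpoints that gets folded into a tool catalog. This is a minimal sketch, not Helios's actual data model: the type names, the `buildCatalog` helper, and the 0.5 relevance cutoff are all assumptions made for illustration.

```typescript
// ASSUMPTION: every name and the default threshold here are illustrative.
interface ScoredEndpoint {
  operationId: string;
  method: string;
  path: string;
  relevance: number; // 0..1, scored by the model against the user's intent
  cluster: string;   // semantic tool name proposed by the model
}

interface SemanticTool {
  name: string;
  endpoints: ScoredEndpoint[];
}

// Drop endpoints that are irrelevant to the intent, then group the rest
// by the cluster label the model assigned.
function buildCatalog(
  scored: ScoredEndpoint[],
  minRelevance: number = 0.5,
): SemanticTool[] {
  const byCluster = new Map<string, ScoredEndpoint[]>();
  for (const ep of scored) {
    if (ep.relevance < minRelevance) continue;
    const group = byCluster.get(ep.cluster) ?? [];
    group.push(ep);
    byCluster.set(ep.cluster, group);
  }
  const tools: SemanticTool[] = [];
  byCluster.forEach((endpoints, name) => tools.push({ name, endpoints }));
  return tools;
}

const catalog = buildCatalog([
  { operationId: "list_customers", method: "GET", path: "/customers", relevance: 0.9, cluster: "query_customers" },
  { operationId: "search_customers", method: "GET", path: "/customers/search", relevance: 0.8, cluster: "query_customers" },
  { operationId: "rotate_api_key", method: "POST", path: "/keys/rotate", relevance: 0.1, cluster: "manage_keys" },
]);

console.log(catalog.map((t) => `${t.name}: ${t.endpoints.length} endpoint(s)`));
```

The Sandbox AI would then read a catalog of this kind instead of the raw spec, which is what keeps its tool list short.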
The generated server itself is not an AI: it’s a clean, deterministic Node.js dispatch layer, so you don’t depend on Helios at runtime. The loop also closes: after sandbox testing you can rename clusters, drop endpoints, merge tools, and regenerate, whereas most generators emit code once and stop.
Three endpoints (list_customers, search_customers, get_all_customers) collapse into one query_customers tool with a mode parameter, for ~70% fewer tokens in that cluster. Across a 100-endpoint API, intent-driven grouping cuts schema tokens by 60–80% and converts the model’s decision from “which of these 12 similar endpoints” to “which capability.”
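The merged tool might look roughly like the following. This is a sketch under stated assumptions: the mode values, the `query` field, and the endpoint paths are hypothetical, and only the names query_customers, list_customers, search_customers, and get_all_customers come from the text.

```typescript
// ASSUMPTION: mode values, field names, and paths are illustrative only.
type QueryCustomersMode = "list" | "search" | "all";

interface QueryCustomersArgs {
  mode: QueryCustomersMode;
  query?: string; // only used when mode is "search"
}

interface EndpointCall {
  method: "GET";
  path: string;
  params: Record<string, string>;
}

// One tool definition stands in for three near-identical ones.
const queryCustomersTool = {
  name: "query_customers",
  description: "List, search, or fetch all customers, selected by mode.",
  inputSchema: {
    type: "object",
    properties: {
      mode: { type: "string", enum: ["list", "search", "all"] },
      query: { type: "string", description: "Search text for mode 'search'." },
    },
    required: ["mode"],
  },
};

// Deterministic dispatch: no model call, just a mapping from mode to endpoint.
function dispatchQueryCustomers(args: QueryCustomersArgs): EndpointCall {
  switch (args.mode) {
    case "list":
      return { method: "GET", path: "/customers", params: {} };
    case "search":
      if (!args.query) {
        throw new Error("query is required when mode is 'search'");
      }
      return { method: "GET", path: "/customers/search", params: { q: args.query } };
    case "all":
      return { method: "GET", path: "/customers/all", params: {} };
  }
}

console.log(queryCustomersTool.name, dispatchQueryCustomers({ mode: "search", query: "acme" }));
```

The model now chooses a capability and a mode instead of picking among three look-alike tool names, and the dispatch stays a plain switch that can be read and tested without any AI in the loop.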
The sandbox is GET-only by design: you can explore a live API through your generated tools without sending write requests to someone’s production data during preview.
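A GET-only policy can be enforced with a small guard in front of every outgoing call. The text states the policy but not how Helios implements it, so the sketch below is an assumption throughout: `PlannedCall`, `assertSandboxSafe`, and `tryCall` are hypothetical names, and the URLs are placeholders.

```typescript
// ASSUMPTION: hypothetical guard; the source does not show the real check.
interface PlannedCall {
  method: string;
  url: string;
}

// Normalize the method and reject anything other than GET.
function assertSandboxSafe(call: PlannedCall): PlannedCall {
  const method = call.method.toUpperCase();
  if (method !== "GET") {
    throw new Error(`Sandbox blocked ${method} ${call.url}: preview is GET-only`);
  }
  return { method, url: call.url };
}

// Report the outcome as a string so the chat UI can show why a call was refused.
function tryCall(call: PlannedCall): string {
  try {
    const safe = assertSandboxSafe(call);
    return `allowed: ${safe.method} ${safe.url}`;
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}

console.log(tryCall({ method: "get", url: "https://api.example.com/customers" }));
console.log(tryCall({ method: "DELETE", url: "https://api.example.com/customers/42" }));
```

A guard like this only restricts the HTTP method. It relies on the upstream API treating GET as a read-only operation.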
Deployed and public at helios-public.vercel.app — bring your own Anthropic key; everything runs in your tab’s session. The built-in info reference covers the whole territory: what an MCP server is, the JSON-RPC flow, where to find API specs and keys, and setup references for every pre-made template.
