
MCP · MULTI-AI · SEMANTIC CLUSTERING

MCP server generator built for ACM UTD — multi-AI architecture with semantic tool clustering to cut context cost.

ORG: ACM @ UTD · team of 4
ROLE: LLM-to-tools integration, app-agent layer
STAGE: LIVE · helios-public.vercel.app
STACK: Next.js 16 · Express · Mongo · Anthropic

You hand Helios an OpenAPI spec and a sentence describing what you actually want to build. Helios reads both, decides which endpoints belong together, names them as semantic tools, lets you chat with a sandboxed version of the resulting server before you commit — then ships a ready-to-deploy TypeScript MCP server as a ZIP. A compiler from “this API and what I want to do with it” to “an MCP server an agent can use without choking on its own tool list.”

Helios landing page — MCP Server Generator with the Build Your Server call to action
FIG · 01 · THE LIVE GENERATOR: BRING YOUR OWN KEY, RUNS ENTIRELY IN YOUR TAB'S SESSION

MCP is infrastructure now — donated to the Linux Foundation, 9,400+ registered servers, ~33 M weekly SDK downloads. But it has a known failure mode the community calls the MCP Tax: a naive server registers every endpoint as its own tool at 200–500 tokens per schema. Five naive servers can dump 15,000–60,000 tokens of overhead into context before the user’s first message — and when twelve tools have near-identical names, models make measurably worse selection decisions.
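
The arithmetic behind that range can be sketched roughly as follows. Only the 200 to 500 tokens-per-schema figure comes from the text above; the per-server tool counts (15 and 24) are assumptions chosen to reproduce the quoted bounds, not measurements.

```typescript
// Rough estimate of schema overhead loaded into context before the first message.
// All names are illustrative; the per-server tool counts are assumptions.
interface OverheadInput {
  servers: number;         // number of connected MCP servers
  toolsPerServer: number;  // assumed average number of tools each server registers
  tokensPerSchema: number; // 200 to 500, per the figure quoted above
}

function schemaOverheadTokens(input: OverheadInput): number {
  return input.servers * input.toolsPerServer * input.tokensPerSchema;
}

const low = schemaOverheadTokens({ servers: 5, toolsPerServer: 15, tokensPerSchema: 200 });
const high = schemaOverheadTokens({ servers: 5, toolsPerServer: 24, tokensPerSchema: 500 });
console.log(low, high); // 15000 60000
```

The overhead scales linearly in all three factors, which is why reducing the number of registered tools is the most direct lever.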

Existing generators (Speakeasy, Stainless, FastMCP, AWS Labs) are all structural — one tool per endpoint, maybe grouped by tag. None of them run an LLM over user intent to decide clustering and naming. That’s the gap.

AI | JOB | WHEN IT RUNS
Generation AI | Scores endpoint relevance against user intent, clusters endpoints into semantic tools, names and describes each cluster | Once, after spec upload + intent
Sandbox AI | Reads the generated catalog, takes chat messages, picks tools, calls the live API (GET-only), returns inspectable results | Continuously, during preview

The generated server itself is not an AI: it’s a clean, deterministic Node.js dispatch layer, so you don’t depend on Helios at runtime. The loop also closes: after sandbox testing you can rename clusters, drop endpoints, merge tools, and regenerate, whereas most generators emit code and stop.
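
As a minimal sketch of what a deterministic dispatch layer could look like, the snippet below routes a tool name to a handler through a static table, with no model call in the request path. The handler table, the `dispatch` function, and the stand-in handler body are all hypothetical and are not taken from Helios output.

```typescript
// A static table from tool name to handler; nothing here calls a model.
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => Promise<unknown>;

const handlers = new Map<string, ToolHandler>([
  // Stand-in handler; a generated server would call the upstream API here.
  ["query_customers", async (args) => ({ tool: "query_customers", args })],
]);

async function dispatch(name: string, args: ToolArgs): Promise<unknown> {
  const handler = handlers.get(name);
  if (handler === undefined) {
    throw new Error(`Unknown tool: ${name}`);
  }
  return handler(args);
}

dispatch("query_customers", { mode: "list" }).then((result) => {
  console.log(JSON.stringify(result));
});
```

Because the table is fixed at generation time, the same tool name and arguments always take the same code path.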

Three endpoints (list_customers, search_customers, get_all_customers) collapse into one query_customers tool with a mode parameter, which takes ~70% fewer tokens for the cluster. Across a 100-endpoint API, intent-driven grouping cuts schema tokens by 60–80% and changes the model’s decision from “which of these 12 similar endpoints” to “which capability.”
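
One way such a merged tool might be expressed is sketched below. The three endpoint names and the `query_customers` tool name come from the example above; the mode values, the `query` field, and the description text are assumptions for illustration, not actual Helios output.

```typescript
// Illustrative only: mode values and the optional query field are assumptions.
type CustomerMode = "list" | "search" | "all";

interface QueryCustomersArgs {
  mode: CustomerMode;
  query?: string; // only meaningful when mode is "search"
}

// One tool definition stands in for three near-identical ones.
const queryCustomersTool = {
  name: "query_customers",
  description: "List, search, or fetch all customers, selected by mode.",
  inputSchema: {
    type: "object",
    properties: {
      mode: { type: "string", enum: ["list", "search", "all"] },
      query: { type: "string" },
    },
    required: ["mode"],
  },
} as const;

// Map each mode back to the original endpoint it replaces.
const endpointForMode: Record<CustomerMode, string> = {
  list: "list_customers",
  search: "search_customers",
  all: "get_all_customers",
};

function resolveEndpoint(args: QueryCustomersArgs): string {
  if (args.mode === "search" && !args.query) {
    throw new Error("mode 'search' requires a query");
  }
  return endpointForMode[args.mode];
}

console.log(resolveEndpoint({ mode: "search", query: "acme" })); // search_customers
```

The model now sees a single schema and picks a mode, while the mapping back to concrete endpoints stays in ordinary code.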

The sandbox is GET-only by design: you can explore a live API through your generated tools without sending a mutating request to someone’s production data during preview.
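
A GET-only rule like this could be enforced with a small guard that runs before any request leaves the sandbox. The sketch below is an assumption about how such a check might look; `PlannedRequest` and `assertReadOnly` are hypothetical names, and the URL is a placeholder.

```typescript
// Hypothetical guard: reject anything but GET before the request is sent.
interface PlannedRequest {
  method: string;
  url: string;
}

function assertReadOnly(req: PlannedRequest): PlannedRequest {
  if (req.method.toUpperCase() !== "GET") {
    throw new Error(`Sandbox is GET-only; blocked ${req.method} ${req.url}`);
  }
  return req;
}

// A GET passes through unchanged; a POST is rejected before any network call.
const allowed = assertReadOnly({ method: "get", url: "https://api.example.com/customers" });
console.log(allowed.url);
```

Putting the check in one place, ahead of the HTTP client, means a tool picked by the Sandbox AI cannot bypass it regardless of which cluster it belongs to.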

Deployed and public at helios-public.vercel.app — bring your own Anthropic key; everything runs in your tab’s session. The built-in info reference covers the whole territory: what an MCP server is, the JSON-RPC flow, where to find API specs and keys, and setup references for every pre-made template.

Helios info page — eight-section reference covering MCP servers, API specs, keys and OAuth setup
FIG · 02 · THE BUILT-IN REFERENCE: 8 SECTIONS FROM 'WHAT IS MCP' TO PER-TEMPLATE OAUTH SETUP