Platform

LLMs, up to 60% cheaper.

Every request routes to the cheapest capable model. 300+ models. No config.

Routes every request

Small tasks go to small models. Hard tasks go to frontier models. Automatic, no config.

300+ models

Anthropic, OpenAI, Google, xAI, Groq, Mistral, OpenRouter. Live price signals.

Caching enforced

Repeated system prompts and context blocks hit the cache on every supported provider.

Measured savings

Dashboard shows your workload's actual vs pinned cost. Typical savings: up to 60%.

Override when you need it

Pin a specific model per request or per instance. Router respects the override.

BYOK or managed

On BYOK, the savings land in your provider bill. On Pay As You Go, they're built into the plan price.

FAQ

How does routing save up to 60%?

Each request goes to the cheapest model that meets the quality bar. Short tasks hit small models; reasoning tasks hit frontier. Prompt caching is enforced. On typical workloads, savings run up to 60%.
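The selection rule can be sketched in a few lines. A minimal illustration, assuming a hypothetical catalog with made-up names, prices, and capability scores (the real router uses live price signals across 300+ models):

```python
# Illustrative cheapest-capable routing. Model names, prices, and
# capability scores below are hypothetical, not the real catalog.
MODELS = [
    {"name": "small-1", "price_per_mtok": 0.15, "capability": 3},
    {"name": "mid-1", "price_per_mtok": 1.00, "capability": 6},
    {"name": "frontier-1", "price_per_mtok": 10.00, "capability": 9},
]

def route(task_difficulty: int) -> dict:
    """Pick the cheapest model that clears the task's difficulty bar."""
    capable = [m for m in MODELS if m["capability"] >= task_difficulty]
    return min(capable, key=lambda m: m["price_per_mtok"])

print(route(2)["name"])  # easy task -> small-1 (cheapest capable)
print(route(8)["name"])  # hard task -> frontier-1
```

Easy requests never pay frontier prices; hard requests never drop below the bar.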

Does routing hurt quality?

No. The router only picks a cheaper model when it clears the task's difficulty bar. Frontier tasks still go to frontier models.

Can I pin a model?

Yes. Pin per request or per instance. The router respects the override.
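A per-request pin is just an explicit model field on the request. A sketch of the shape, assuming a hypothetical `model` field name rather than the documented API:

```python
# Hypothetical request builder. The "model" field name is an
# assumption for illustration, not the documented API surface.
from typing import Optional

def build_request(prompt: str, model: Optional[str] = None) -> dict:
    """Routed by default; setting a model pins it and bypasses routing."""
    req = {"messages": [{"role": "user", "content": prompt}]}
    if model is not None:
        req["model"] = model  # router respects this override
    return req

routed = build_request("summarize this")            # router chooses
pinned = build_request("summarize this", "gpt-4o")  # pinned, no routing
```

Omit the field and the router decides; set it and that model is used as-is.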

How are savings measured?

Every request logs the routed cost versus the counterfactual of pinning one frontier model. The dashboard shows the rolling average.
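The metric itself is simple arithmetic. A sketch with illustrative token counts and prices (not real rates), comparing routed cost against the counterfactual of pinning one frontier model:

```python
# Sketch of the dashboard's savings metric. Token counts and
# per-million-token prices here are illustrative, not real rates.
PINNED_PRICE = 10.00  # $/M tokens for the counterfactual frontier model

# (tokens, routed $/M tokens) for each logged request
requests = [(1_000, 0.50), (1_000, 2.00), (1_000, 10.00)]

routed_cost = sum(tok / 1e6 * price for tok, price in requests)
pinned_cost = sum(tok / 1e6 * PINNED_PRICE for tok, _ in requests)
savings = 1 - routed_cost / pinned_cost

print(f"savings: {savings:.0%}")  # ~58% on this sample
```

The dashboard reports the rolling average of this ratio over your actual workload.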


Ready to launch a managed instance?

Production OpenClaw or Hermes, live in under 5 minutes. Pricing starts at $20/month.