


Smart Model Cascading

Cost-optimized multi-model routing: try cheap models first and escalate when confidence is low. Supports per-conversation budgets and tenant cost ceilings.


Intelligent Cost Optimization

EDDI's Model Cascading system enables cost-aware multi-model routing. Start with fast, inexpensive models and automatically escalate to more powerful (and more expensive) models only when confidence is low, reducing AI costs without sacrificing quality.

Cascading Features

How It Works

Configure a cascade chain of models ordered by cost. For each user message, EDDI tries the cheapest model first and evaluates confidence. If confidence falls below the threshold, EDDI automatically escalates to the next model in the chain. This approach can reduce LLM costs by 60-80% for typical workloads in which most queries are simple enough for smaller models.
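The escalation loop described above can be sketched in a few lines of Python. This is an illustrative sketch only, not EDDI's implementation or configuration format: the names `Tier`, `CascadeResult`, and `run_cascade`, the flat per-call cost, the 0.7 threshold, and the two stand-in models are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical model interface: takes a message, returns (answer, confidence in [0, 1]).
ModelFn = Callable[[str], Tuple[str, float]]


@dataclass
class Tier:
    name: str
    cost_per_call: float  # illustrative flat cost units, not real pricing
    call: ModelFn


@dataclass
class CascadeResult:
    answer: str
    model: str
    confidence: float
    total_cost: float
    escalations: int


def run_cascade(message: str, tiers: List[Tier], threshold: float = 0.7,
                budget: Optional[float] = None) -> CascadeResult:
    """Try tiers from cheapest to most expensive until one is confident enough."""
    if not tiers:
        raise ValueError("cascade chain must contain at least one tier")
    ordered = sorted(tiers, key=lambda t: t.cost_per_call)
    total_cost = 0.0
    calls = 0
    last = None  # (answer, model name, confidence) of the most recent call
    for tier in ordered:
        # Assumed budget rule: never start a call that would exceed the budget.
        if budget is not None and total_cost + tier.cost_per_call > budget:
            break
        answer, confidence = tier.call(message)
        total_cost += tier.cost_per_call
        calls += 1
        last = (answer, tier.name, confidence)
        if confidence >= threshold:
            break  # confident enough, no escalation needed
    if last is None:
        raise RuntimeError("budget too small for even the cheapest tier")
    # If no tier met the threshold, the answer from the last tier tried is returned.
    return CascadeResult(last[0], last[1], last[2], total_cost, calls - 1)


# Stand-in models for demonstration: the small one is only confident on short input.
def small_model(message: str) -> Tuple[str, float]:
    return ("short answer", 0.9) if len(message) < 20 else ("unsure", 0.3)


def large_model(message: str) -> Tuple[str, float]:
    return ("detailed answer", 0.95)


CHAIN = [Tier("large", 10.0, large_model), Tier("small", 1.0, small_model)]

simple = run_cascade("Hi there", CHAIN)
hard = run_cascade("Explain the trade-offs of eventual consistency", CHAIN)
capped = run_cascade("Explain the trade-offs of eventual consistency", CHAIN, budget=5.0)
```

In this sketch, an escalated request pays for every tier it tried, so the savings depend on how many messages the cheapest tier resolves. With the illustrative costs above (1 and 10 units), if 80% of messages stop at the small model, the expected cost per message is 0.8 × 1 + 0.2 × 11 = 3.0 units versus 10 units for always calling the large model, a 70% reduction.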