An evaluation framework for systematically assessing LLM routing systems, together with a dataset of over 405k inference outcomes from representative LLMs. It provides a theoretical framework for LLM routing and a comparative analysis of routing approaches, setting a standard for evaluating multi-LLM deployments.
from benchthing import Bench

# Create a benchmark client for RouterBench
bench = Bench("router-bench")

# Run the benchmark task against the candidate routing systems
bench.run(
    benchmark="router-bench",
    task_id="1",
    models=["model-router-1", "model-router-2"],
)

# Retrieve the outcome for the submitted task
result = bench.get_result("1")