BIRD-SQL

BIRD (Big Bench for Large-scale Database Grounded Text-to-SQL Evaluation) contains 12,751 unique question-SQL pairs over 95 large databases with a total size of 33.4 GB. It covers more than 37 professional domains and is designed to evaluate the performance of text-to-SQL models on large-scale, real-world databases.
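
To make "question-SQL pair" concrete, here is a minimal sketch of what one item might look like, along with one common way to score a text-to-SQL prediction: run both queries and compare the returned rows without regard to order. The interface, its field names, the `sameResultSet` helper, and the example item are all illustrative assumptions, not the official BIRD schema or scoring script.

```typescript
// Hypothetical shape of one benchmark item. Field names are illustrative
// and are not taken from the official BIRD data files.
interface QuestionSqlPair {
  question: string;   // natural-language question
  sql: string;        // reference ("gold") SQL query
  databaseId: string; // database the query is meant to run against
}

// One row of a query result.
type Row = ReadonlyArray<string | number | null>;

// Order-insensitive comparison of two query results. Each result is
// treated as a set of rows, so duplicate rows collapse into one.
function sameResultSet(predicted: Row[], gold: Row[]): boolean {
  const key = (row: Row): string => JSON.stringify(row);
  const a = new Set(predicted.map(key));
  const b = new Set(gold.map(key));
  if (a.size !== b.size) return false;
  return Array.from(a).every((k) => b.has(k));
}

// Invented example item, for illustration only.
const example: QuestionSqlPair = {
  question: 'How many schools are in Alameda county?',
  sql: "SELECT COUNT(*) FROM schools WHERE county = 'Alameda'",
  databaseId: 'example_db',
};
```

Comparing executed results rather than SQL text means two differently written queries still count as equivalent when they return the same rows.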

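import { Bench } from 'benchthing';

// Create a Bench instance for the BIRD-SQL benchmark.
const bench = new Bench('bird-sql');

// Start a run with task ID '1' against the listed models.
await bench.run({
  benchmark: 'bird-sql',
  taskId: '1',
  models: ['text-davinci-003', 'gpt-3.5-turbo'],
});

// Fetch the result using the same task ID.
const result = await bench.getResult('1');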

Sign up to get access to the BIRD-SQL benchmark API