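The Semantic Textual Similarity (STS) benchmark evaluates the degree of semantic equivalence between two text snippets. It comprises datasets from the SemEval STS tasks (2012-2017), with text drawn from image captions, news headlines, and user forums. STS measures graded, bidirectional similarity: each pair receives a score on a scale rather than a yes/no label, and the score does not depend on the order of the two snippets. This makes it useful for NLP tasks such as machine translation (MT) evaluation, information extraction, and summarization.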
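import { Bench } from 'benchthing';

// Top-level await below requires an ES module context.
const bench = new Bench('sts');

await bench.run({
  benchmark: 'sts',
  taskId: '1',
  // Placeholder: replace with the embedding models you want to evaluate.
  models: yourEmbeddingModels,
});

// Retrieve the result using the same taskId passed to run().
const result = await bench.getResult('1');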
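
To make the idea of graded similarity concrete, the following is a minimal sketch of how an embedding model is commonly scored on STS-style data: compute the cosine similarity of the two snippet embeddings for each pair, then correlate those predictions with the human-assigned gold scores using Pearson correlation. It is not taken from benchthing and may differ from what that library does internally. The helper names (`cosineSimilarity`, `pearson`), the toy three-dimensional vectors, and the gold scores (assumed here to lie on a 0 to 5 scale) are all illustrative assumptions.

```typescript
// Hypothetical helper: cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error('vectors must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('cosine similarity is undefined for a zero vector');
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: Pearson correlation between two equal-length series.
function pearson(x: number[], y: number[]): number {
  if (x.length !== y.length || x.length < 2) {
    throw new Error('series must have equal length of at least 2');
  }
  const n = x.length;
  const meanX = x.reduce((sum, v) => sum + v, 0) / n;
  const meanY = y.reduce((sum, v) => sum + v, 0) / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  if (varX === 0 || varY === 0) {
    throw new Error('correlation is undefined for a constant series');
  }
  return cov / Math.sqrt(varX * varY);
}

// Toy data (made up for illustration): embeddings for each snippet pair
// plus an assumed human gold score for that pair.
const pairs = [
  { a: [1, 0, 0], b: [1, 0, 0], gold: 5.0 },
  { a: [1, 1, 0], b: [1, 0, 0], gold: 3.0 },
  { a: [0, 1, 0], b: [1, 0, 0], gold: 0.0 },
];

// Predicted similarity per pair, then agreement with the gold scores.
const predicted = pairs.map((p) => cosineSimilarity(p.a, p.b));
const gold = pairs.map((p) => p.gold);
const score = pearson(predicted, gold);

console.log(`Pearson correlation: ${score.toFixed(3)}`);
```

Because cosine similarity is symmetric in its two arguments, swapping the snippets in a pair leaves the prediction unchanged, which matches the bidirectional nature of the task. Spearman rank correlation is often reported alongside Pearson and could be substituted for `pearson` in the same way.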