A comprehensive benchmark that evaluates Large Multimodal Models (LMMs) from an image generation perspective. It features MMGenBench-Test, which covers 13 distinct image patterns, and MMGenBench-Domain for domain-specific evaluation. The automated pipeline has each LMM write a generative prompt from an input image; a text-to-image model then recreates the image from that prompt, and the recreation is compared with the original to score how well the LMM understood it.
from benchthing import Bench

# Connect to the MMGenBench benchmark
bench = Bench("mmgenbench")

# Submit a run evaluating the listed multimodal models
bench.run(
    benchmark="mmgenbench",
    task_id="1",
    models=["multimodal-model-1", "multimodal-model-2"],
)

# Retrieve the results for task "1"
result = bench.get_result("1")
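
For orientation, the image-to-prompt-to-image loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not MMGenBench's actual implementation: `describe_image`, `generate_image`, `similarity`, and `PipelineResult` are hypothetical stand-ins for the LMM, the text-to-image model, and the image-comparison scorer.

from dataclasses import dataclass
from typing import Callable

@dataclass
class PipelineResult:
    prompt: str   # prompt the LMM produced for the input image
    score: float  # similarity between the recreated image and the original


def evaluate_one(
    image_path: str,
    describe_image: Callable[[str], str],      # hypothetical: LMM turns an image into a prompt
    generate_image: Callable[[str], bytes],    # hypothetical: text-to-image model renders the prompt
    similarity: Callable[[str, bytes], float], # hypothetical: scores recreation vs. original
) -> PipelineResult:
    """Run a single image through the LMM -> text-to-image -> comparison loop."""
    prompt = describe_image(image_path)         # LMM writes a generative prompt
    recreation = generate_image(prompt)         # text-to-image model recreates the image
    score = similarity(image_path, recreation)  # higher means closer to the original
    return PipelineResult(prompt=prompt, score=score)

In this sketch the three callables are injected as parameters, so any LMM, text-to-image model, or image-representation metric can be plugged in without changing the loop itself.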