AImpact
High · Multimodal AI · 1 min read

InternVL 2.5: 78B open source that beats GPT-4V on OCR and math

In one sentence: Shanghai AI Lab releases InternVL 2.5 with 78B parameters under Apache 2.0, achieving SOTA on MathVista, OCRBench, and ChartQA and surpassing GPT-4V on numerous multimodal benchmarks.

Verified Official source

InternVL 2.5 is the most capable open-source vision-language model (VLM) at the time of its release: 78 billion parameters under an Apache 2.0 license, which permits free commercial use. It beats GPT-4V on visual math (MathVista), reading text in images (OCRBench), and understanding charts and tables (ChartQA). For the first time, an open-weight model surpasses the best proprietary models on multiple benchmarks simultaneously, demonstrating that open source can compete at the highest level in multimodal AI.

Companies

Shanghai AI Lab

Tools

InternVL 2.5, InternVL2.5-78B

Tags

VLM, SOTA, Math, OCR, Chart Understanding, Open Source

Sources