Published December 2, 2025 | Version 1.0.0
Dataset Open

Reference Set and MLLM Visual Information Extraction Prototype

  • 1. University of Pittsburgh School of Medicine
  • 2. University of Pittsburgh

Contributors

  • 1. Rochester Institute of Technology

Description

This dataset contains manually extracted data from figures and tables reported in pharmacology studies, previously collected as part of an internal reference set. It includes images of the visual elements and their corresponding values from 45 published pharmacology studies with distinct PubMed IDs. Multiple images could be sampled from each paper, and multiple values were often sampled from each image. The reference set therefore contains multiple rows with values drawn from images of tables or of figures (graphs, plots, or charts). The dataset contains annotated information from 43 images of figures and 40 images of tables. The visual elements contain data from eight types of experiments: in vitro enzyme inhibition, induction, and kinetics; in vitro transporter inhibition, induction, and kinetics; in vivo enzyme kinetics; and in vivo interaction studies. The selected sample represents a wide range of styles, layouts, and structures for both figures and tables.

We also provide code from our multimodal large language model (MLLM) Visual Information Extraction (VIE) prototype, which uses the Pydantic AI v1.25 Python module to connect to multiple models, perform VIE, and produce structured JSON output. Our pilot VIE system processed images from the reference set and used the rest of the annotated information to generate prompts.
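To illustrate what "structured JSON output" can mean in practice, the following is a minimal sketch that checks a model's JSON reply against a fixed record shape. It deliberately uses only the Python standard library rather than Pydantic AI, and the schema is an assumption made for illustration: the field names (pmid, image_type, parameter, value, unit), the top-level "values" key, and the placeholder PMID are hypothetical and are not taken from the prototype's code.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedValue:
    """One value read from a figure or table image (hypothetical schema)."""
    pmid: str            # PubMed ID of the source paper
    image_type: str      # "figure" or "table"
    parameter: str       # name of the reported quantity
    value: float         # numeric value read from the image
    unit: Optional[str] = None


def parse_model_output(raw_json: str) -> list[ExtractedValue]:
    """Parse a model's JSON reply and reject records that do not fit the schema."""
    payload = json.loads(raw_json)
    records = []
    for item in payload["values"]:
        if item["image_type"] not in {"figure", "table"}:
            raise ValueError(f"unexpected image_type: {item['image_type']!r}")
        records.append(
            ExtractedValue(
                pmid=str(item["pmid"]),
                image_type=item["image_type"],
                parameter=str(item["parameter"]),
                value=float(item["value"]),  # raises if the value is not numeric
                unit=item.get("unit"),
            )
        )
    return records


# Made-up reply with a placeholder PMID, used only to exercise the parser.
example_reply = (
    '{"values": [{"pmid": "00000000", "image_type": "table", '
    '"parameter": "IC50", "value": "1.5", "unit": "uM"}]}'
)
records = parse_model_output(example_reply)
```

Rejecting malformed records at parse time keeps downstream scoring from silently comparing non-numeric or mislabeled values against the reference set.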

We have evaluated the following models.

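Inference Provider   Model Company   Model Name          Context Window   Number of Parameters
------------------   -------------   -----------------   --------------   --------------------
AWS Bedrock          Anthropic       Claude Sonnet 3.7   128K             *
AWS Bedrock          Anthropic       Claude Sonnet 4.0   1M               *
AWS Bedrock          AWS             Nova Pro            300K             *
AWS Bedrock          AWS             Nova Premier        1M               *
AWS Bedrock          Meta            Llama 3.2           128K             90B
AWS Bedrock          Meta            Llama 4 Scout       10M              109B
AWS Bedrock          Meta            Llama 4 Maverick    1M               400B
Open AI API          Open AI         GPT-4o              128K             *
Open AI API          Open AI         GPT-5               400K             *
Google Vertex        Google          Gemini 2.5 Pro      1M               *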

*The actual number of parameters for this model has not been made publicly available.

Error corrections:

In "Manuscript Results folder" > "Tolerance Based ACC.ods", the tolerance-based accuracy in cells F12-J21 was calculated incorrectly: each corresponding cell in F1-J10 was divided by 172 instead of 162. For example, the content of cell F12 should be changed from "=ROUND(F1/172,3)*100" to "=ROUND(F1/162,3)*100".
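The effect of the corrected denominator can be checked with a short sketch that mirrors the spreadsheet formula ROUND(x/total, 3)*100. The count of 81 below is a hypothetical input chosen only to show the arithmetic, not a value from the results file, and the ROUND_HALF_UP rounding mode is an assumption about how the spreadsheet's ROUND treats ties.

```python
from decimal import Decimal, ROUND_HALF_UP


def accuracy_percent(count: int, total: int) -> Decimal:
    """Mirror =ROUND(count/total, 3)*100 using exact decimal arithmetic."""
    ratio = (Decimal(count) / Decimal(total)).quantize(
        Decimal("0.001"), rounding=ROUND_HALF_UP
    )
    return ratio * 100


hypothetical_count = 81  # illustrative only, not taken from the dataset

wrong = accuracy_percent(hypothetical_count, 172)      # original, incorrect denominator
corrected = accuracy_percent(hypothetical_count, 162)  # corrected denominator

print(f"divided by 172: {wrong}%")
print(f"divided by 162: {corrected}%")
```

Because 162 is smaller than 172, every corrected accuracy is higher than the value originally computed from the same count.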

Files

Files (1.4 GB)

Checksum (MD5)                         Size
md5:c8e8891a15590faa9cd0fbb334d8e228   756.9 kB
md5:5f96a3619bfeb588ab38ca783113148f   3.8 kB
md5:a6b64aef5b613d87b8bfa9ea3abb1db3   9.3 kB
md5:7a4ac1898ec56b38990c15b0171b5ff8   37.0 kB
md5:6c0bdb263bfda5d3e7bb84bf81aa7e59   7.3 kB
md5:c40da088a59f49af7bd7fa85f16890e9   265.5 kB
md5:a6731cd414ee993be7e7d084a4338b3d   607.8 MB
md5:9f56152a733b5467b3ac516530d91d80   1.9 kB
md5:5de30d230258b59c01b84bf8eaef69cf   832.5 MB
md5:00b14652d428218ccb572d2dcef16c5d   4.1 kB
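The MD5 checksums listed for the files can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard hashlib module; the function names are illustrative, and the file path in the usage note is a placeholder because file names are not shown in the listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed.removeprefix("md5:").strip().lower()
    return md5_of_file(path) == expected
```

For example, matches_listed_checksum("downloaded_file", "md5:a6731cd414ee993be7e7d084a4338b3d") would return True only if the local file has the same digest as the 607.8 MB entry. Chunked reading matters here because two of the files are several hundred megabytes.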

Additional details

Software

Repository URL
https://github.com/dbmi-pitt/visual-info-extraction
Programming language
Python
Development Status
Concept