Reference Set and MLLM Visual Information Extraction Prototype
Authors/Creators
Description
This dataset contains data manually extracted from figures and tables reported in pharmacology studies that were previously collected as part of an internal reference set. It includes images of the visual elements and their corresponding values from 45 published pharmacology studies with distinct PubMed IDs. Multiple images could be sampled from each paper, and multiple values were often sampled from each image. The reference set therefore contains multiple rows with values from images of tables or from figures (graphs, plots, or charts). In total, it contains annotated information from 43 images of figures and 40 images of tables. The visual elements contain data from any of eight types of experiments: in vitro enzyme inhibition, induction, and kinetics; in vitro transporter inhibition, induction, and kinetics; in vivo enzyme kinetics; and in vivo interaction studies. The selected sample represents a wide range of styles, layouts, and structures for both figures and tables.
We also provide code from our MLLM (multimodal large language model) Visual Information Extraction (VIE) prototype, which uses the Pydantic AI v1.25 Python module to connect with multiple models, perform VIE, and produce structured JSON output. Our pilot VIE system processed images from the reference set, using the rest of the annotated information to generate prompts.
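To make "structured JSON output" concrete, the following is a minimal sketch of how a model reply could be parsed into typed records. It is not the prototype's code: the prototype relies on Pydantic AI, whereas this sketch uses only the Python standard library, and the class name `ExtractedValue`, its fields, and the example reply are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ExtractedValue:
    """One value read from a figure or table image (hypothetical fields)."""
    pubmed_id: str
    image_type: str            # "figure" or "table"
    parameter: str             # name of the reported quantity
    value: float
    unit: Optional[str] = None


def parse_model_output(raw_json: str) -> list:
    """Check a model's JSON reply and convert it to typed records."""
    payload = json.loads(raw_json)
    records = []
    for item in payload["values"]:
        if item["image_type"] not in {"figure", "table"}:
            raise ValueError(f"unexpected image_type: {item['image_type']!r}")
        records.append(
            ExtractedValue(
                pubmed_id=str(item["pubmed_id"]),
                image_type=item["image_type"],
                parameter=item["parameter"],
                value=float(item["value"]),  # accepts numbers or numeric strings
                unit=item.get("unit"),
            )
        )
    return records


# Hypothetical reply for illustration; not taken from the reference set.
example_reply = json.dumps({
    "values": [
        {"pubmed_id": "00000000", "image_type": "table",
         "parameter": "IC50", "value": "1.5", "unit": "uM"}
    ]
})
records = parse_model_output(example_reply)
print(json.dumps([asdict(r) for r in records], indent=2))
```

Requesting a fixed schema like this from every model makes replies from different providers directly comparable against the manually extracted values.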
We have evaluated the following models.
| Inference Provider | Model Company | Model Name | Context Window | Number of Parameters |
|---|---|---|---|---|
| AWS Bedrock | Anthropic | Claude Sonnet 3.7 | 128K | * |
| AWS Bedrock | Anthropic | Claude Sonnet 4.0 | 1M | * |
| AWS Bedrock | AWS | Nova Pro | 300K | * |
| AWS Bedrock | AWS | Nova Premier | 1M | * |
| AWS Bedrock | Meta | Llama 3.2 | 128K | 90B |
| AWS Bedrock | Meta | Llama 4 Scout | 10M | 109B |
| AWS Bedrock | Meta | Llama 4 Maverick | 1M | 400B |
| OpenAI API | OpenAI | GPT-4o | 128K | * |
| OpenAI API | OpenAI | GPT-5 | 400K | * |
| Google Vertex | Google | Gemini 2.5 Pro | 1M | * |

*The actual number of parameters for this model has not been made publicly available.
Error corrections:
In "Manuscript Results folder" > "Tolerance Based ACC.ods", the tolerance-based accuracy in cells F12-J21 was calculated incorrectly: each corresponding cell in F1-J10 was divided by 172 instead of 162. For example, to correct cell F12, change its content from "=ROUND(F1/172,3)*100" to "=ROUND(F1/162,3)*100".
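The effect of the correction can be reproduced outside the spreadsheet. The sketch below mirrors the formula `=ROUND(F1/162,3)*100` in Python; the function name and the count of 81 are hypothetical and serve only to show how the denominator changes the reported percentage.

```python
def tolerance_accuracy_percent(n_within_tolerance: int, n_total: int = 162) -> float:
    """Mirror the spreadsheet formula =ROUND(count/n_total, 3)*100."""
    if not 0 <= n_within_tolerance <= n_total:
        raise ValueError("count must be between 0 and n_total")
    # Python's round() may differ from a spreadsheet ROUND() on exact ties.
    return round(n_within_tolerance / n_total, 3) * 100


hypothetical_count = 81  # illustrative value, not taken from the spreadsheet

# Corrected formula: divide by 162.
corrected = tolerance_accuracy_percent(hypothetical_count)

# Erroneous formula from the original file: divide by 172.
uncorrected = round(hypothetical_count / 172, 3) * 100

print(f"corrected: {corrected:.1f}%  uncorrected: {uncorrected:.1f}%")
```

Because the erroneous denominator was larger, every uncorrected percentage in F12-J21 understates the accuracy relative to the corrected value.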
Files
(1.4 GB)
| MD5 checksum | Size |
|---|---|
| md5:c8e8891a15590faa9cd0fbb334d8e228 | 756.9 kB |
| md5:5f96a3619bfeb588ab38ca783113148f | 3.8 kB |
| md5:a6b64aef5b613d87b8bfa9ea3abb1db3 | 9.3 kB |
| md5:7a4ac1898ec56b38990c15b0171b5ff8 | 37.0 kB |
| md5:6c0bdb263bfda5d3e7bb84bf81aa7e59 | 7.3 kB |
| md5:c40da088a59f49af7bd7fa85f16890e9 | 265.5 kB |
| md5:a6731cd414ee993be7e7d084a4338b3d | 607.8 MB |
| md5:9f56152a733b5467b3ac516530d91d80 | 1.9 kB |
| md5:5de30d230258b59c01b84bf8eaef69cf | 832.5 MB |
| md5:00b14652d428218ccb572d2dcef16c5d | 4.1 kB |
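The MD5 checksums listed for the files can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names are hypothetical, and the file path in the usage note is a placeholder because the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use low for the larger archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("downloaded_file", "md5:00b14652d428218ccb572d2dcef16c5d")` would return `True` only if the placeholder path `downloaded_file` points to the file that carries that checksum.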
Additional details
Software
- Repository URL: https://github.com/dbmi-pitt/visual-info-extraction
- Programming language: Python
- Development status: Concept