Published October 7, 2025 | Version v1.0.1

PicAxe

Description

PicAxe 1.0.1 Release Notes:

Monday, October 6th, 2025

Welcome to PicAxe v1.0.1! We've made some small patches: updates to the READMEs and Docker images for both pipelines to make installation and use more seamless. We've also added a visual flowchart to the main branch README.

This release of PicAxe was updated by Qilin Zhou and Bruno Felalaga, with supervision by Dr. Anna Clemencia Guerrero (Santa Fe Institute), advised by Dr. Aaron K. Dinner (UChicago) and Dr. Julia Damerow (Arizona State University).

Fixed Issues

  1. Users reported three issues when setting up and running PicAxe-OCR: (a) install_pcks.py had to be run to install layoutparser, but the README did not mention this; (b) setup-tools was missing at first run; and (c) runs with --bulk and with --sample both failed without reporting an error. The README and Docker image have been updated to address these issues. We retested the pipeline to confirm that they were resolved, and we expect no further issues pulling the Docker image and running PicAxe-OCR.
  2. The Docker image tag for PicAxe-YOLO was originally the placeholder "tagname"; it has been updated to "latest".
  3. Before running PicAxe-YOLO with Docker, users need to create host folders. We have added instructions to the README for PicAxe-YOLO about where users need to create folders to (a) place their own input PDFs, (b) output the extraction results, and (c) store our pretrained YOLO weights.
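The host-folder step in item 3 could be scripted along the following lines. This is only a minimal sketch: the folder names "input_pdfs", "output", and "weights" and the helper create_host_folders are hypothetical placeholders, not names taken from PicAxe. The PicAxe-YOLO README states the actual locations to use.

```python
from pathlib import Path

# Hypothetical folder names for illustration only; the PicAxe-YOLO README
# gives the actual locations for input PDFs, extraction output, and weights.
HOST_FOLDERS = ("input_pdfs", "output", "weights")


def create_host_folders(base_dir, names=HOST_FOLDERS):
    """Create each named folder under base_dir and return the resulting paths.

    Folders that already exist are left untouched.
    """
    base = Path(base_dir)
    created = []
    for name in names:
        folder = base / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    for folder in create_host_folders("."):
        print(folder.resolve())
```

Running the script from the directory that will be mounted into the container creates the three folders and prints their absolute paths. Running it a second time changes nothing.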

Known Issues

  1. Extraction results will not be perfect from either pipeline. Users should always check the results of extraction before performing further data analysis. For more details about how we are working to improve extraction results, please see the main README file.
  2. Package dependencies can cause issues (noted in the respective README files), so we have provided Docker files. Docker images that are not pulled for some time will be deleted, so the images may no longer be available at some point.

Notes

If you use, test, or refer to PicAxe, please cite it as below.

Files

acguerr1/PicAxe-v1.0.1.zip (30.3 MB)
md5:fb12d792ea04b36e10e600fa4b63865f
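The MD5 checksum listed above can be used to confirm that the archive downloaded intact. The sketch below uses only Python's standard hashlib module. The local file name "PicAxe-v1.0.1.zip" and the helper names md5_of_file and matches_expected are assumptions for illustration; adjust the path to wherever the archive was saved.

```python
import hashlib
import sys

# MD5 checksum listed for the release archive on this page.
EXPECTED_MD5 = "fb12d792ea04b36e10e600fa4b63865f"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; pass a different path as the first argument.
    archive = sys.argv[1] if len(sys.argv) > 1 else "PicAxe-v1.0.1.zip"
    print("checksum OK" if matches_expected(archive) else "checksum MISMATCH")
```

A mismatch usually means an incomplete or corrupted download, in which case downloading the archive again is the simplest remedy.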

Additional details

Related works

Is supplement to
Software: https://github.com/acguerr1/PicAxe/tree/v1.0.1