Published September 30, 2025 | Version v1.0.0
Dataset Open

Totally Unimodular Node–Arc Incidence Matrices: Medium Collection B (20,000 - 50,000 nodes)

  • 1. Royal Holloway, University of London

Description

This dataset contains a collection of totally unimodular (TU) node–arc incidence matrices generated from random directed graphs with between 20,000 and 50,000 nodes. Each column of an incidence matrix corresponds to one arc and has exactly one +1 (at the arc's tail node) and one –1 (at its head node), with zeros elsewhere. Because these matrices are TU, the linear programming relaxation of an integer flow problem with integer supplies, demands, and capacities has only integral vertex solutions, so any optimal basic solution returned by an LP solver is already integral.
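As a small illustration of the column structure described above, the following sketch builds a dense incidence matrix for a toy graph and checks that every column has exactly one +1 and one –1. The function names and the four-node graph are hypothetical and are not taken from the dataset; a dense list-of-lists is only practical at this toy scale.

```python
def incidence_matrix(num_nodes, arcs):
    """Dense node-arc incidence matrix: rows are nodes, columns are arcs."""
    matrix = [[0] * len(arcs) for _ in range(num_nodes)]
    for col, (tail, head) in enumerate(arcs):
        if tail == head:
            raise ValueError("self-loops have no valid incidence column")
        matrix[tail][col] = 1    # arc leaves its tail node
        matrix[head][col] = -1   # arc enters its head node
    return matrix


def columns_are_valid(matrix):
    """Check that every column holds exactly one +1, one -1, and zeros elsewhere."""
    for column in zip(*matrix):
        if sorted(v for v in column if v != 0) != [-1, 1]:
            return False
    return True


# Toy directed graph (hypothetical): arcs 0->1, 1->2, 0->2, 2->3
arcs = [(0, 1), (1, 2), (0, 2), (2, 3)]
A = incidence_matrix(4, arcs)
for row in A:
    print(row)
print(columns_are_valid(A))
```

Every column sums to zero, which is why the rows of a node–arc incidence matrix are linearly dependent.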

The dataset includes:

  • matrices.csv: sparse representation of all instances (two rows per arc, +1 and –1).

  • metadata.csv: summary of each instance (nodes, arcs, density).

  • Conversion scripts (make_dat_all.py, make_dat_all.R) to produce .dat files for AMPL or other solvers.
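To make the "two rows per arc" layout concrete, here is a minimal sketch of how such a long-format file could be written and read back. The column names (`instance`, `arc`, `node`, `value`) are assumptions made for illustration; the actual header of matrices.csv may differ and should be checked first.

```python
import csv
import io

# Hypothetical column names; check the header of matrices.csv before reuse.
FIELDS = ["instance", "arc", "node", "value"]


def arcs_to_rows(instance, arcs):
    """Yield two long-format rows per arc: +1 at the tail, -1 at the head."""
    for arc_id, (tail, head) in enumerate(arcs):
        yield {"instance": instance, "arc": arc_id, "node": tail, "value": 1}
        yield {"instance": instance, "arc": arc_id, "node": head, "value": -1}


def rows_to_arcs(rows):
    """Rebuild {(instance, arc): (tail, head)} from streamed long-format rows."""
    ends = {}
    for row in rows:
        key = (row["instance"], int(row["arc"]))
        tail, head = ends.get(key, (None, None))
        if int(row["value"]) == 1:
            tail = int(row["node"])
        else:
            head = int(row["node"])
        ends[key] = (tail, head)
    return ends


# Round trip through an in-memory CSV using a toy instance (not dataset data).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(arcs_to_rows("toy", [(0, 1), (1, 2), (0, 2)]))

buffer.seek(0)
recovered = rows_to_arcs(csv.DictReader(buffer))
print(recovered)
```

The dictionary in `rows_to_arcs` grows with the number of arcs, so on the full file it would make sense to process one instance at a time instead of collecting everything.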

Typical applications include benchmarking large-scale network flow and minimum-cost flow solvers, and studying algorithmic scalability.

⚠️ Large file notice:

This collection totals about 19.4 GB, nearly all of it in a single file. A file of this size cannot usually be opened in spreadsheet software or loaded fully into memory on a typical laptop. For analysis, we recommend:

  • Chunked reading (e.g. pandas.read_csv(..., chunksize=...)),

  • Out-of-core or lazy-evaluation frameworks such as Dask or Polars, or

  • Importing into a database (e.g. PostgreSQL, SQLite).
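As one possible way to combine the chunked-reading and database options, the sketch below streams a CSV file into SQLite in fixed-size batches. It uses the standard-library `csv` and `sqlite3` modules instead of pandas. The table name `entries` and the four-column layout are assumptions for illustration, not the documented schema of matrices.csv.

```python
import csv
import io
import itertools
import sqlite3


def import_csv_in_batches(csv_file, connection, batch_size=100_000):
    """Stream rows from an open CSV file into SQLite without loading it whole."""
    # Assumed four-column layout; adjust to the real header of the file.
    connection.execute(
        "CREATE TABLE IF NOT EXISTS entries "
        "(instance TEXT, arc INTEGER, node INTEGER, value INTEGER)"
    )
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row
    total = 0
    while True:
        # Pull at most batch_size rows, so memory use stays bounded.
        batch = list(itertools.islice(reader, batch_size))
        if not batch:
            break
        connection.executemany("INSERT INTO entries VALUES (?, ?, ?, ?)", batch)
        connection.commit()
        total += len(batch)
    return total


# Demonstration on a tiny in-memory sample (toy data, not from the dataset).
sample = io.StringIO(
    "instance,arc,node,value\n"
    "toy,0,0,1\ntoy,0,1,-1\ntoy,1,1,1\ntoy,1,2,-1\n"
)
conn = sqlite3.connect(":memory:")
inserted = import_csv_in_batches(sample, conn, batch_size=3)
arc_counts = conn.execute(
    "SELECT instance, COUNT(DISTINCT arc) FROM entries GROUP BY instance"
).fetchall()
print(inserted, arc_counts)
```

For the real file, the same function could be called with `open("matrices.csv", newline="")` and an on-disk database path; adding an index on the instance column after the import would likely speed up per-instance queries.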

For smaller and more manageable datasets, please see the Small and Medium A collections.

Files

README.md

Files (19.4 GB)

Checksum                               Size
md5:711fb2f67fc3d57522e740ecba7bbb97   3.7 kB
md5:1bf12712fda3d9a2a7bc382374e349b3   2.4 kB
md5:5f2cce496ed1432c151420e46d84e1f3   19.4 GB
md5:7560b32bfad60c43f446ee007f31b015   2.8 kB
md5:399b7986e36cd418b3a1d1baacda3853   2.6 kB