Published April 21, 2026 | Version v1
Report Open

Sustaining the Commons in the AI Economy: A Landscape Scan of Challenges and Strategies for Bridging AI Companies and Open Curated Collections

  • 1. Invest in Open Infrastructure
  • 2. San Jose State University

Description

This landscape scan was conducted by Invest in Open Infrastructure in the context of the Building Resilient Infrastructure through Dialogue, Growth, and Exchange (BRIDGE) project, which explores how stakeholders in the knowledge ecosystem can work together as AI reshapes how collections are used and valued.


Curated collections form part of the digital commons: shared open resources maintained by academic institutions, non-profit organizations, governments, and private companies. Sustained largely by public and private funding and an ethos of open knowledge sharing, these collections represent a public good inseparable from the health of open science and democratic access to information. That infrastructure is now under strain. AI labs bring insatiable demand for immediate access to high-quality training data. Automated bots now generate traffic that, in some cases, exceeds human visits, overwhelming servers and inflating costs. AI-generated submissions threaten to overwhelm editorial workflows. The bots are operated by large technology companies as well as a significant long tail of start-ups and individual experimenters.


In response, collections stewards have deployed bot-blocking tools, updated access policies, and pursued licensing strategies. Yet these responses have proven inadequate. Technical countermeasures are routinely circumvented. Licensing frameworks are nascent and poorly enforced. Legal protections are fragmented and oriented toward intellectual property rather than broader harms to the commons. Defensive access restrictions risk accelerating data consolidation, undermining the openness they were designed to protect. The report points toward a more promising path: commons-based governance grounded in reciprocal norms and shared interests. Data users have concrete reasons to want the commons to survive: loss of key data sources reduces training data quality; an increasingly walled-off web raises legal risks; and public frustration creates reputational pressure. Well-maintained curated collections offer unique, authoritative content that produces better models. Investment in the commons, properly framed, is investment in the quality of AI.


Yet a central tension remains unresolved. Existing frameworks rely on voluntary compliance, while evidence shows that voluntary frameworks have so far failed to change behaviour at scale. Whether enlightened self-interest will prove more effective than the legal and technical mechanisms that have already fallen short remains an open question. Answering it requires engaging curators and consumers of open collections as stewards of the digital commons, co-creating partnership models that align open knowledge strategies with commercial demand.

Files

IOI_ Sustaining the Commons in the AI Economy_ A Landscape Scan of Challenges and Strategies for Bridging AI Companies and Open Curated Collections.pdf

Additional details

Funding

Andrew W. Mellon Foundation
Building Resilient Infrastructure through Dialogue, Growth, and Exchange (BRIDGE)

Dates

Available
2026-04-21