Evaluating and Regulating Agentic AI: A Study of Benchmarks, Metrics, and Regulation computing and processing

Raza, Shaina

doi:10.36227/techrxiv.176186841.18883348/v2

Published November 11, 2025 | Version v1

Preprint Open

Agentic AI represents a new generation of Artificial Intelligence (AI) systems capable of perceiving, reasoning, planning, and acting toward achieving goals with a degree of autonomy. Unlike traditional AI models that merely generate outputs, these systems maintain memory, interact with their environment, and adapt over time. However, evaluating such interactive and evolving behavior remains a significant challenge. While several recent surveys have examined agentic AI architectures, components, and applications, few have systematically reviewed their evaluation, particularly regarding performance, reliability, and governance across an evolving agentic AI ecosystem. This paper addresses that gap by reviewing recent progress in the development and assessment of agentic AI, focusing on three core dimensions: benchmarks, metrics, and governance. We analyze how current evaluation frameworks capture reasoning, planning, collaboration, and ethical alignment across single- and multi-agent systems. Ultimately, this study aims to establish a unified foundation for building trustworthy, auditable, and human-aligned AI agents. The project webpage is available at project link.

Files

Name	Size
1350845.pdf (md5:b2a5e5eb074a4579d235d3f1aaaad0fd)	3.2 MB

Additional details

Funding: European Commission
AIXPERT - An agentic, multi-layer, GenAI-powered backbone to make an AI system explainable, accountable, and transparent (101214389)

