Published January 8, 2026 | Version v1
Publication Open

Reliability-Aware and Explainability-Driven Evaluation of Graph Neural Networks on Citation Networks

  • 1. Lincoln University College

Description

While Graph Neural Networks (GNNs) are widely used for citation network analysis, their reliability and interpretability remain understudied. Existing benchmarks predominantly focus on accuracy, overlooking confidence behavior and failure modes critical for real-world deployment. To address this, we introduce a reliability-aware evaluation framework for GNNs that holistically compares three prominent architectures—GCN, GAT, and GraphSAGE—on node classification and link prediction across Cora, Citeseer, and PubMed. Beyond accuracy, we integrate GNNExplainer to investigate model interpretability and uncover a critical overconfidence phenomenon: incorrect predictions show 22.1% higher confidence than correct ones, indicating severe miscalibration. Our ablation studies reveal that edge-based augmentation outperforms feature-based augmentation by +1.2% accuracy, and that single-head GAT performs comparably to multi-head on homophilic graphs, suggesting architectural redundancy. Statistical analysis confirms GAT’s superior performance (82.7% accuracy) and calibration, but all models exhibit strong reliance on edge importance (r=0.82 with confidence). These findings motivate the necessity of calibration-aware GNN evaluation and post-hoc correction techniques, offering actionable insights for architecture selection and trustworthy deployment of GNNs in scholarly applications.

Files

Main.pdf (204.7 kB)
md5:11e15367f97edc73447b918f94e604f7

Additional details

Software

Repository URL
https://github.com/sanu123-mj/gnn-graph-ai
Programming language
Python
Development Status
Active