Dissecting BERT Layers: FFN Dual Role, Separability-Guided Layer Skip, and Interpretable Classification via Charge-Flow Learning

Cynn, Yeonseong

doi:10.5281/zenodo.20032756

Published May 5, 2026 | Version v1

Preprint | Open access

We present a layer-level analysis framework for BERT across five GLUE tasks. Using RX (River XAI), a charge-flow-based interpretable learning framework, we replace BERT’s classifier with an interpretable network of 2–16 nodes and identify removable layers through separability analysis. Our key contributions are: (1) a separability-guided layer-skip method validated by actual BERT forward-pass experiments on all five tasks; (2) a quantitative decomposition of the FFN’s dual role, 92% structural (norm normalization) vs. 8% classification-relevant, which explains why removing the FFN causes model collapse even though individual layers appear “harmful” to classification; and (3) an error analysis revealing that 60–93% of misclassifications are high-confidence errors (margin > 0.3), indicating that BERT’s CLS representation itself is the bottleneck. RX is one application of a broader proprietary learning framework developed at River Lab; method specifics are subject to intellectual property protection.
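To make the high-confidence error statistic in contribution (3) concrete, here is a minimal sketch of how such a fraction could be computed. The abstract does not define "margin"; this sketch assumes it is the gap between the two largest predicted class probabilities. The function name `high_confidence_error_rate` and the toy data are hypothetical and are not taken from the preprint or from RX.

```python
import numpy as np

def high_confidence_error_rate(probs, labels, margin_threshold=0.3):
    """Fraction of misclassified examples whose margin exceeds the threshold.

    Assumption: margin = (largest class probability) - (second largest).
    The preprint may define margin differently.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    preds = probs.argmax(axis=1)
    # Sort each row ascending so the last two columns hold the top-2 probabilities.
    sorted_probs = np.sort(probs, axis=1)
    margins = sorted_probs[:, -1] - sorted_probs[:, -2]
    wrong = preds != labels
    if not wrong.any():
        return 0.0  # no errors, so no high-confidence errors
    return float((margins[wrong] > margin_threshold).mean())

# Toy binary-classification outputs (illustrative only, not data from the preprint).
probs = np.array([
    [0.90, 0.10],  # predicted 0, label 1: wrong, margin 0.80
    [0.55, 0.45],  # predicted 0, label 1: wrong, margin 0.10
    [0.20, 0.80],  # predicted 1, label 1: correct
    [0.30, 0.70],  # predicted 1, label 0: wrong, margin 0.40
])
labels = np.array([1, 1, 1, 0])
rate = high_confidence_error_rate(probs, labels)
print(f"high-confidence error fraction: {rate:.2f}")
```

In this toy case two of the three misclassifications have a margin above 0.3. A high fraction means the classifier is confidently wrong rather than undecided, which is the pattern the abstract attributes to the CLS representation.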

Files (295.4 kB)

Name	Size
river_7 (2).pdf (md5:53de7e3755af7d2e4edbba3d4237fb6d)	295.4 kB
