Published June 11, 2026 | Version v1
Report | Open access

Gradient Clipping Effects on Training Stability and NDCG@10 in Lion vs. AdamW for ModernBERT Cross-Encoders on MS MARCO

Authors/Creators

  • Autonomous AI Research System

Description

Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the ad

Research goal: What is the effect of gradient clipping on the training stability and final NDCG@10 scores when using Lion versus AdamW for ModernBERT cross-encoders on MS MARCO?
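
For readers unfamiliar with the two quantities named in the research goal, the following is a minimal sketch of clip-by-global-norm and of NDCG@10. It is an illustration under stated assumptions, not the report's evaluation code: the function names (clip_by_global_norm, dcg_at_k, ndcg_at_k) are hypothetical, gradients are plain Python lists rather than framework tensors, and the gain is assumed to be the exponential form 2^rel - 1, which coincides with linear gain for binary relevance labels.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient vectors so their joint L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # Scale is 1.0 when the norm is already within the threshold.
    scale = min(1.0, max_norm / (total_norm + 1e-12))
    return [[g * scale for g in vec] for vec in grads], total_norm

def dcg_at_k(rels, k=10):
    """Discounted cumulative gain over the top-k relevance labels, in ranked order."""
    # Assumed gain: 2^rel - 1; discount: log2(rank + 1) with rank starting at 1.
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))

def ndcg_at_k(rels, k=10):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering of the same labels."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

if __name__ == "__main__":
    clipped, norm_before = clip_by_global_norm([[3.0, 4.0]], max_norm=1.0)
    print("norm before clipping:", norm_before)
    print("clipped gradients:", clipped)
    # One relevant passage ranked second among the candidates.
    print("NDCG@10:", ndcg_at_k([0, 1, 0, 0]))
```

In this sketch the pre-clipping norm is returned alongside the clipped gradients because logging that value per step is one common way to observe training stability; whether the report measures stability this way is not stated on this page.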

Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 7.7/10.

Notes

This report was generated autonomously by SOVEREIGN Research Kernel, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 7.7/10.

Files

paper.pdf (79.0 kB)
md5:9e8d424c63a8db78b6fa3dbc1ddc0848