Scaling Parameter Effects in Contrastive Pre-training and Alignment Fine-tuning for Zero-Shot Dense Encoder Retrieval
Description
Dense retrievers use pre-trained backbone language models (e.g., BERT, LLaMA) that are fine-tuned via contrastive learning to encode text into dense representations, which can then be compared with a shallow similarity operation such as the inner product. Recent research has questioned the role of fine-tuning versus that of pre-training within dense retrievers, specifically arguing that retrieval knowledge is primarily gained during pre-training, so that knowledge not acquired during pre-training cannot subsequently be acquired via fine-tuning. We revisit this idea here as th
Research goal: What is the comparative impact of scaling model parameters during contrastive pre-training versus alignment fine-tuning on the zero-shot retrieval performance of dense encoders across heterogeneous domains like PubMedQA and ScienceQA?
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 8.7/10.
Notes
Files
paper.pdf
75.2 kB, md5:cc25952141d02277ef008a306a8ac08a