Comparative Analysis of Token-Level and Sentence-Level Debiasing on Semantic Textual Similarity Preservation Across Diverse
Description
Large Language Models (LLMs) are capable of successfully performing many language processing tasks zero-shot (without training data). If zero-shot LLMs can also reliably classify and explain social phenomena like persuasiveness and political ideology, then LLMs could augment the Computational Social Science (CSS) pipeline in important ways. This work provides a road map for using LLMs as CSS tools. Towards this end, we contribute a set of prompting best practices and an extensive evaluation pipeline to measure the zero-shot performance of 13 language models on 25 representative English CSS ben
Research goal: How well do debiased contextualized embeddings preserve semantic textual similarity (STS) when debiasing is applied at the token level versus the sentence level, across diverse domain benchmarks such as STS-bench or Multi-Genre Natural Language Inference (MNLI)?
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 9.0/10.
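One way to make the research goal concrete is sketched below. It is only an illustration under stated assumptions, not the method used in the report: debiasing is assumed to be a linear projection that removes a single bias direction, sentence embeddings are assumed to be mean-pooled token vectors, and preservation is scored as the Spearman rank correlation between cosine similarities before and after debiasing. All function names (`remove_direction`, `debias_token_level`, `debias_sentence_level`, `sts_preservation`) are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def remove_direction(vec, direction):
    """Project out an assumed bias direction from a single vector."""
    scale = dot(vec, direction) / dot(direction, direction)
    return [x - scale * d for x, d in zip(vec, direction)]


def mean_pool(token_vecs):
    """Average token vectors into one sentence vector (assumed pooling)."""
    n = len(token_vecs)
    return [sum(col) / n for col in zip(*token_vecs)]


def debias_token_level(token_vecs, direction):
    # Debias every token vector first, then pool.
    return mean_pool([remove_direction(t, direction) for t in token_vecs])


def debias_sentence_level(token_vecs, direction):
    # Pool first, then debias the single sentence vector.
    return remove_direction(mean_pool(token_vecs), direction)


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def sts_preservation(pairs, direction, debias):
    """Rank agreement of pair similarities before vs. after debiasing."""
    before = [cosine(mean_pool(a), mean_pool(b)) for a, b in pairs]
    after = [cosine(debias(a, direction), debias(b, direction)) for a, b in pairs]
    return spearman(before, after)
```

Under these two assumptions (linear projection, mean pooling) the token-level and sentence-level variants produce the same sentence vector, because projection commutes with averaging. Any difference between the two strategies therefore has to come from nonlinear steps, such as debiasing before the encoder's later layers or using a non-mean pooling; the sketch only fixes the scoring side of the comparison.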
Files
| Name | Size | Checksum |
|---|---|---|
| paper.pdf | 76.2 kB | md5:2725f23791db5b228f284323e2e7fac1 |