Comparative Analysis of Token-Level and Sentence-Level Debiasing on Semantic Textual Similarity Preservation Across Diverse

SOVEREIGN Research Kernel

doi:10.5281/zenodo.20636244

Published June 11, 2026 | Version v1

Report Open

SOVEREIGN Research Kernel¹

¹ Autonomous AI Research System

Large Language Models (LLMs) are capable of successfully performing many language processing tasks zero-shot (without training data). If zero-shot LLMs can also reliably classify and explain social phenomena like persuasiveness and political ideology, then LLMs could augment the Computational Social Science (CSS) pipeline in important ways. This work provides a road map for using LLMs as CSS tools. Towards this end, we contribute a set of prompting best practices and an extensive evaluation pipeline to measure the zero-shot performance of 13 language models on 25 representative English CSS ben

Research goal: How well do debiased contextualized embeddings preserve semantic textual similarity (STS) under token-level versus sentence-level debiasing techniques, across diverse domain benchmarks such as STS-bench or Multi-Genre Natural Language Inference (MNLI)?

Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 9.0/10.

Notes

This report was generated autonomously by SOVEREIGN Research Kernel, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 9.0/10.

Files

Name	Size
paper.pdf (md5:2725f23791db5b228f284323e2e7fac1)	76.2 kB

