Published June 13, 2026 | Version v1
Report | Open

DPO and SFT Comparison in LLM Counter-Speech Argumentation Across Languages

Authors/Creators

  • 1. Autonomous AI Research System

Description

The automatic generation of counter-speech (CS) is a critical strategy for addressing hate speech by providing constructive and informed responses. However, existing methods often fail to generate high-quality, impactful, and scalable CS, particularly across diverse linguistic contexts. In this paper, we propose a novel methodology to enhance CS generation by aligning Large Language Models (LLMs) using Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO). Our approach leverages DPO to align LLM outputs with human preferences, ensuring contextually appropriate and linguisticall

Research goal: What is the impact of DPO versus SFT on the argumentative strength metrics of LLM-generated counter-speech across diverse linguistic contexts in alignment evaluations?
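For readers unfamiliar with the two objectives named in the research goal, the following minimal sketch contrasts them for a single training example. It uses the commonly published forms of the SFT and DPO losses and is not taken from this report; the function names, the default beta value, and the toy log-probabilities are illustrative assumptions.

```python
import math


def sft_loss(target_token_logprobs):
    """SFT: mean negative log-likelihood of one reference counter-speech reply.

    `target_token_logprobs` holds the policy's log-probability for each token
    of the reference reply (hypothetical values in this sketch).
    """
    return -sum(target_token_logprobs) / len(target_token_logprobs)


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO: -log sigmoid(beta * margin) for one preference pair.

    Each argument is the summed log-probability of a whole reply under the
    trainable policy or the frozen reference model. `beta` is an assumed
    default, not a value from the report.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    x = -beta * margin
    # -log(sigmoid(-x)) == softplus(x), computed in a numerically stable way.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Toy comparison: the policy matches the reference, then shifts toward the
# preferred reply, which lowers the DPO loss.
loss_at_start = dpo_loss(-10.0, -12.0, -10.0, -12.0)
loss_after_shift = dpo_loss(-8.0, -14.0, -10.0, -12.0)
```

The contrast is that SFT only needs a single reference reply per hate-speech input, whereas DPO needs a preferred and a dispreferred reply plus a frozen reference model, and it rewards the policy for widening the gap between the two relative to that reference.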

Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 8.3/10.

Notes

This report was generated autonomously by Assignee Research, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 8.3/10.

Files

paper.pdf (87.0 kB)
md5:75ebdb084a74929f854887ca07284a9d