Published November 17, 2023 | Version v1
Conference paper (Open Access)

Neutralization of Evaluative Expressions Based on Dictionary Data and Distributional Models

  • 1. Saint Petersburg State University

Description

Text style transfer (TST) is an important task in natural language generation that aims to change the stylistic properties of a text while preserving its style-independent content. Following the success of deep learning in the last decade, a variety of neural networks have been proposed for TST. When parallel data is available, sequence-to-sequence models are typically used; however, most use cases lack parallel data. This paper therefore presents three methods that require no parallel data for the automatic identification and replacement of obscene evaluative expressions in a text: one based on the online dictionary Wiktionary, and two based on transformer models (BERT, GPT-2). The methods are evaluated both manually and automatically on a toxic dataset extracted from the popular Russian social network VKontakte (VK). Experimental results demonstrate that the transformer-based (BERT) method achieves the highest average score (0.86) across style-strength and content-preservation metrics.
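As a rough illustration of the dictionary-based approach described above, the identification-and-replacement step can be sketched as a lexicon lookup with substitution. The word list and replacement map below are hypothetical stand-ins; the paper's actual lexicon is derived from Wiktionary, and the BERT variant would instead mask the offending token and let a masked language model propose a neutral substitute.

```python
import re

# Hypothetical stand-in for the Wiktionary-derived lexicon of
# obscene/evaluative expressions (the real list comes from the paper's data).
EVALUATIVE_LEXICON = {"moron", "idiot", "stupid"}

# Hypothetical neutral replacements; the BERT-based method would instead
# replace each flagged word with a [MASK] token and take the model's
# top fill-in candidate.
NEUTRAL_MAP = {"moron": "person", "idiot": "person", "stupid": "poor"}

def neutralize(text: str) -> str:
    """Replace lexicon words with neutral substitutes, keeping other content."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        return NEUTRAL_MAP.get(word.lower(), word)

    # Whole-word, case-insensitive match against the lexicon.
    pattern = r"\b(" + "|".join(map(re.escape, EVALUATIVE_LEXICON)) + r")\b"
    return re.sub(pattern, repl, text, flags=re.IGNORECASE)

print(neutralize("Only a moron would write such a stupid comment."))
# prints: Only a person would write such a poor comment.
```

The same scaffolding supports the transformer variants: the detection step stays identical, and only the substitution step changes (masked-LM infilling for BERT, conditional generation for GPT-2).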

Files (846.3 kB)

Vyb.pdf (846.3 kB, md5:c91c070221652d6648c1cc033d97e670)