There is a newer version of the record available.

Published February 14, 2024 | Version v1
Conference proceeding Open

Plagiarism Detection Using Keystroke Logs

Description

This study examines the potential of keystroke logs to distinguish authentic writing from transcribed writing. Transcribed writing produced within writing platforms where copy and paste functions are disabled indicates that students are likely copying texts from the internet or from generative artificial intelligence (AI) models. Transcribed texts should differ from authentic texts, where writers follow a process that includes monitoring, evaluating, and revising texts. This study develops a transcription detection model by using keystroke logs within a machine learning model to predict whether an essay is authentic or transcribed. Results indicated that a random forest model trained on keystroke logs predicted whether an essay was authentic or transcribed with 99% accuracy. Authentic writing included a greater number of pauses before sentences and words, had a greater number of insertions and longer insertions, deleted more words and characters, and had a greater number of revisions than transcribed writing. Transcribers, on the other hand, produced a greater number of writing bursts because they were simply copying language. Overall, the results indicated that authentic writing is a dynamic process in which writers monitor their writing and evaluate whether it needs to be changed when problems are identified. Transcribed writing, on the other hand, is much more linear. The results have important implications for plagiarism detection.
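
The process measures named in the description (pauses, insertions, deletions) can be illustrated with a small feature-extraction sketch. This is not the study's pipeline: the `KeyEvent` log format, the `PAUSE_MS` threshold, and the `extract_features` function are hypothetical names and values chosen for illustration, and the record does not specify how the study defined or computed its features.

```python
from dataclasses import dataclass

PAUSE_MS = 2000  # assumed pause threshold; not taken from the study


@dataclass
class KeyEvent:
    """One logged keystroke (hypothetical log format)."""
    time_ms: int     # timestamp of the keystroke
    kind: str        # "insert" or "delete"
    position: int    # cursor position where the event occurred
    doc_length: int  # document length just before the event


def extract_features(events):
    """Summarise one writing session as simple process counts."""
    pauses = 0
    deletions = 0
    mid_text_insertions = 0
    prev = None
    for cur in events:
        # A long gap between consecutive keystrokes counts as a pause.
        if prev is not None and cur.time_ms - prev.time_ms >= PAUSE_MS:
            pauses += 1
        if cur.kind == "delete":
            deletions += 1
        elif cur.kind == "insert" and cur.position < cur.doc_length:
            # Typing anywhere but the end of the text is treated as a revision.
            mid_text_insertions += 1
        prev = cur
    return {
        "pauses": pauses,
        "deletions": deletions,
        "mid_text_insertions": mid_text_insertions,
    }


# A linear, transcription-like session: steady typing at the end of the text.
transcribed = [
    KeyEvent(0, "insert", 0, 0),
    KeyEvent(200, "insert", 1, 1),
    KeyEvent(400, "insert", 2, 2),
]

# A session with pausing and revising, as described for authentic writing.
authentic = [
    KeyEvent(0, "insert", 0, 0),
    KeyEvent(3000, "insert", 1, 1),
    KeyEvent(3200, "delete", 2, 2),
    KeyEvent(6000, "insert", 0, 1),
]

print(extract_features(transcribed))
print(extract_features(authentic))
```

Per-essay feature dictionaries of this kind could then serve as input rows for a classifier such as a random forest, which is the model family the description reports using.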

Files

ai_detection_final_not_anon.pdf

Files (277.3 kB)

md5:f2413fee6e01990bbbdc0cd7f1478cd8
277.3 kB