Published May 1, 2024 | Version 1.0.4
Software Open

Fast Stylometry

  • 1. Fast Data Science Ltd

Contributors

Project leader:

  • 1. Fast Data Science Ltd

Description

Fast Stylometry is a Python library for calculating Burrows' Delta, an algorithm used in forensic stylometry to measure how similar the writing styles of documents are.

The library can also calculate the probability that two books were written by the same author.

I wrote this library to improve my understanding, and also because the existing libraries I could find focused on generating graphs but did not go as far as calculating probabilities.

Burrows' Delta algorithm

Burrows' Delta is a statistic which expresses the distance between two authors' writing styles. A high number such as 3 implies that the two authors' styles are very dissimilar, whereas a low number such as 0.2 implies that the two books are very likely to be by the same author.

Burrows' Delta is calculated by comparing the relative frequencies of function words such as “inside” and “and” in the two texts, taking into account how much each word's frequency naturally varies between authors.
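The calculation described above can be sketched in plain Python. This is a minimal illustration and not the library's own implementation: the function names `relative_frequencies` and `burrows_delta`, the `vocab_size` parameter, the use of the population standard deviation, and the choice to skip words whose frequency does not vary are all assumptions made for this example. It also assumes every text is already tokenised and non-empty.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_frequencies(tokens, vocab):
    """Return the share of `tokens` taken up by each word in `vocab`."""
    counts = Counter(tokens)
    total = len(tokens)  # assumes the text is not empty
    return [counts[word] / total for word in vocab]


def burrows_delta(corpus, name_a, name_b, vocab_size=50):
    """Illustrative Burrows' Delta between two texts in `corpus`.

    `corpus` maps a text name to its list of tokens (hypothetical layout).
    """
    # 1. Take the most frequent words over the whole corpus as the vocabulary.
    overall = Counter(tok for toks in corpus.values() for tok in toks)
    vocab = [word for word, _ in overall.most_common(vocab_size)]

    # 2. Relative frequency of each vocabulary word in each text.
    freqs = {name: relative_frequencies(toks, vocab)
             for name, toks in corpus.items()}

    # 3. Mean and spread of each word's frequency across all texts.
    columns = list(zip(*freqs.values()))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]

    # 4. Convert frequencies to z-scores, skipping words with no variation.
    def z_scores(name):
        return [(f - m) / s
                for f, m, s in zip(freqs[name], means, stds) if s > 0]

    z_a, z_b = z_scores(name_a), z_scores(name_b)
    if not z_a:
        return 0.0  # no word varies, so the texts cannot be told apart

    # 5. Delta is the mean absolute difference between the z-scores.
    return mean(abs(a - b) for a, b in zip(z_a, z_b))


if __name__ == "__main__":
    toy_corpus = {
        "book_a": "the cat and the dog and the bird".split(),
        "book_b": "the cat and the dog and the bird".split(),
        "book_c": "a fish in a pond in a lake of mud".split(),
    }
    print(burrows_delta(toy_corpus, "book_a", "book_b"))
    print(burrows_delta(toy_corpus, "book_a", "book_c"))
```

Dividing each frequency difference by the word's standard deviation is what takes natural variation into account: a word whose usage fluctuates widely between texts contributes less to the distance than a word whose usage is normally stable.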

Files

Files (10.8 MB)

Name: faststylometry-main (1).zip
Size: 10.8 MB
Checksum: md5:dcf194870cd8d84304b43019841b1e13

Additional details

Software

Repository URL: https://github.com/fastdatascience/faststylometry
Programming language: Python
Development Status: Active