-----------------------------------------------------
README
-----------------------------------------------------
By Ellie Bennett, University of Helsinki

In this folder you can find the files used in the analysis of the chapter 'Tunteiden kehollisuus uusassyrialaisessa imperiumissa: digitaaliset ihmistieteet muinaisen Lähi-idän tutkimuksen apuna' (henceforth Tunteiden).

TEXT CORPORA
'Oracc2022_clean_30042024.txt' is the master text corpus. It includes everything on the Open Richly Annotated Cuneiform Corpus (Oracc) that was tagged as 'Akkadian' and dated to the Neo-Assyrian period. The data was originally downloaded in 2022. There are 7,969 texts, 1,014,890 tokens (i.e. words), and 19,436 unique words. Every word is represented as lemma[sense]POS according to Oracc annotations, and every line represents a text. Words with no lemmatisation or annotations are represented as '_'.
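
As an illustration of the line format described above, the following is a minimal Python sketch of how one line of the corpus might be split into its parts. It is not taken from the project's own code. It assumes that words are separated by whitespace and that the POS tag follows the closing bracket directly; the function name parse_line and the sample line are made up for the example.

```python
import re

# Assumed token shape: lemma[sense]POS, or a bare "_" for a word without
# lemmatisation. The exact separators in the corpus file are an assumption.
TOKEN_RE = re.compile(
    r"(?P<lemma>[^\s\[\]]+)\[(?P<sense>[^\]]*)\](?P<pos>\S*)"
    r"|(?<!\S)_(?!\S)"
)


def parse_line(line):
    """Return one entry per word: (lemma, sense, pos), or None for '_'."""
    tokens = []
    for match in TOKEN_RE.finditer(line):
        if match.group("lemma") is None:
            # The second alternative matched: an unlemmatised word.
            tokens.append(None)
        else:
            tokens.append(
                (match.group("lemma"), match.group("sense"), match.group("pos"))
            )
    return tokens


# Made-up sample line, for illustration only.
sample = "šarru[king]N _ râmu[love]V"
for token in parse_line(sample):
    print(token)
```

Since every line represents a text, applying parse_line to each line of the file yields one list of words per text.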

In the subfolder 'Data according to king', the master corpus was divided into 38 sub-corpora based on the Oracc tag designating the reign of the king during which the text was written. There is an additional Excel sheet that was used to help divide the sub-corpora.

In the subfolder 'Data broken down by genre', the master corpus was divided into 11 sub-corpora based on the texts' genre. The original Oracc tags were aggregated into wider categories, which are outlined in the Excel file 'Genre Tags'.

ANALYSIS
The tool for analysis was a Python script called 'Word counts.py'. It is a simple script used to count the attestations of words for the knee, liver, and love in a single text file.
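
For readers without access to the script, the kind of count described above can be sketched roughly as follows. This is a hedged reconstruction and not the contents of 'Word counts.py'. It assumes whitespace-separated words in the lemma[sense]POS format and UTF-8 encoding; the function names are hypothetical, and the list of target lemmata is a parameter so that the relevant words for knee, liver, and love can be supplied by the user.

```python
def count_lemmas(lines, targets):
    """Count how often each target lemma occurs in an iterable of lines."""
    counts = {target: 0 for target in targets}
    for line in lines:
        for token in line.split():
            # The lemma is whatever precedes the first "[" of a token.
            lemma = token.split("[", 1)[0]
            if lemma in counts:
                counts[lemma] += 1
    return counts


def count_lemmas_in_file(path, targets):
    """Same count for a single text file (UTF-8 encoding is an assumption)."""
    with open(path, encoding="utf-8") as handle:
        return count_lemmas(handle, targets)


# Made-up sample lines, for illustration only.
sample_lines = [
    "kabattu[liver]N _ râmu[love]V",
    "râmu[love]V šarru[king]N",
]
print(count_lemmas(sample_lines, ["kabattu", "râmu"]))
```

Matching on the lemma alone ignores sense and POS, which may or may not be what the original script does.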

The script was run once for each sub-corpus in the folders 'Data according to king' and 'Data broken down by genre'. The results were recorded in the Excel spreadsheet 'Love_stats'. The spreadsheet includes its own ReadMe on the first sheet.
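
The per-sub-corpus runs described above could also be done in one pass. The sketch below is an illustration under assumptions, not the procedure actually used: it assumes that each sub-corpus is a UTF-8 '.txt' file directly inside its folder, and the function names are hypothetical.

```python
from pathlib import Path


def count_lemmas_in_file(path, targets):
    """Count target lemmata in one file of whitespace-separated tokens."""
    counts = {target: 0 for target in targets}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            for token in line.split():
                # The lemma is whatever precedes the first "[" of a token.
                lemma = token.split("[", 1)[0]
                if lemma in counts:
                    counts[lemma] += 1
    return counts


def count_folder(folder, targets):
    """Return {file name: counts} for every .txt file in the folder."""
    results = {}
    for path in sorted(Path(folder).glob("*.txt")):
        results[path.name] = count_lemmas_in_file(path, targets)
    return results


# Example call (folder name from this README; file layout is assumed):
# for name, counts in count_folder("Data according to king", ["kabattu", "râmu"]).items():
#     print(name, counts)
```

The resulting dictionary has one entry per sub-corpus file, which can then be copied into a spreadsheet by hand or exported.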

IMAGES
'Figure 1' is a graph showing the attestations of kabattu ('liver') and râmu ('to love') in the corpus according to genre.

CC-BY-4.0 Ellie Bennett.