Online film subtitles as a corpus: An ngram-based approach
Creators
Description
This paper investigates online film subtitles as a separate register of communication from a quantitative perspective. Subtitles of English-language films and of films in other languages translated into English are compared with registers of spoken and written communication represented by large corpora of British and American English. A series of quantitative analyses based on n-gram frequencies demonstrates that subtitles are not fundamentally different from other registers of English and that they closely approximate British and American informal conversations. However, the subtitles differ from the conversations in several functional characteristics that are typical of scripted dialogue in films and TV series in general. Namely, the language of subtitles is more emotional and dynamic, but less spontaneous, vague, and narrative than that of naturally occurring conversations. The paper also compares subtitles in original English with subtitles translated from other languages and detects variation that can be explained by differences in communicative styles.
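As a rough illustration of what an n-gram frequency comparison between two registers can look like, the following minimal Python sketch counts n-grams in two tokenized samples and compares their relative-frequency profiles. It is not the paper's method: the abstract does not name the measures used, so the cosine similarity here, the function names, and the toy sentences are all assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def ngram_counts(tokens, n):
    """Count contiguous n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def relative_freqs(counts):
    """Convert raw n-gram counts to relative frequencies."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity of two sparse frequency profiles (dicts).

    An illustrative measure only; the paper may use different statistics.
    """
    dot = sum(value * q.get(gram, 0.0) for gram, value in p.items())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical toy samples standing in for two registers.
subtitles = "i love you i love you so much".split()
conversation = "you know i mean you know sort of".split()

sub_profile = relative_freqs(ngram_counts(subtitles, 2))
conv_profile = relative_freqs(ngram_counts(conversation, 2))

print(cosine_similarity(sub_profile, conv_profile))
```

With real corpora, the same profiles could be built per register and compared pairwise; the choice of n, tokenization, and similarity measure would all need to follow the paper itself.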
Files
Name | Size |
---|---|
Levshina_SubtitlesAsCorpus_revised.pdf (md5:0aadb7bf05ee866a99ab2ccba20467ff) | 740.7 kB |