Working paper Open Access

Computational Approach to Bengali Stress

Bhattasali, Shohini

In this work, my goal is to train a computational model to detect stress in Bengali using data from a speech corpus, and then to compare my results against existing accounts of Bengali stress, which differ in their analyses.

Stress refers to the relative prominence of portions of an utterance (Liberman and Prince 1977). It has also been defined as the linguistic manifestation of rhythmic structure (Liberman 1975, Liberman and Prince 1977). Hayes (1995) explains this further and states that in stress languages, “every utterance has a rhythmic structure that serves as a framework for that utterance’s phonological and phonetic realization” (8). However, any formal theory of stress has to account for considerable cross-linguistic variation and the different acoustic correlates of stress such as duration and intensity.

Hayes (1980) proposed that stress patterns can be classified into two types: quantity-insensitive stress systems and quantity-sensitive stress systems. Quantity-insensitive systems are those in which syllable weight plays no role in conditioning stress placement. Gordon (2011) cites the Australian language Maranungku as an example of a quantity-insensitive stress system. In this language, primary stress falls on the first syllable of a word and secondary stress falls on the remaining odd-numbered syllables, as seen in examples (1) – (4) from Tryon (1970).

(1) ˈtiralk ‘saliva’

(2) ˈmæræˌpæt ‘beard’

(3) ˈjaŋarˌmata ‘the Pleiades’

(4) ˈŋaltiˌritiˌti ‘tongue’
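The Maranungku pattern described above can be sketched as a small rule. This is only an illustration, not part of the model used in this paper: the function names are hypothetical, and the syllable divisions passed in are my own assumptions rather than divisions given in the cited sources.

```python
# Illustrative sketch of a quantity-insensitive stress rule (Maranungku-style).
# Assumption: words arrive already split into syllables; the divisions used
# in the usage example are hypothetical and not taken from Tryon (1970).

PRIMARY, SECONDARY, UNSTRESSED = 2, 1, 0


def maranungku_stress(syllables):
    """Return one stress level per syllable.

    Primary stress goes on the first syllable; secondary stress goes on
    every other odd-numbered syllable (3rd, 5th, ...). Syllable weight
    is never consulted.
    """
    levels = []
    for position, _ in enumerate(syllables, start=1):
        if position == 1:
            levels.append(PRIMARY)
        elif position % 2 == 1:
            levels.append(SECONDARY)
        else:
            levels.append(UNSTRESSED)
    return levels


def render(syllables, levels):
    """Prefix each stressed syllable with the matching IPA stress mark."""
    marks = {PRIMARY: "ˈ", SECONDARY: "ˌ", UNSTRESSED: ""}
    return "".join(marks[level] + syl for syl, level in zip(syllables, levels))


if __name__ == "__main__":
    word = ["mæ", "ræ", "pæt"]  # assumed syllabification of 'beard'
    print(render(word, maranungku_stress(word)))
```

Because the rule only counts syllables from the left edge, the same function applies to a word of any length without knowing anything about its segments.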

Conversely, in quantity-sensitive stress systems, stress placement depends on syllable weight. Yana (Sapir and Swadesh 1960) is an example of such a language. In Yana, stress falls on the leftmost heavy syllable (CVV or CVC); if a word has no heavy syllable, the initial syllable receives stress, as seen in examples (5) – (8).

(5) ˈp’udiwi ‘women’

(6) siˈbumk’ai ‘sandstone’

(7) suˈk’oːniyaː ‘name of Indian tribe’

(8) tsiniˈjaː ‘no’
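The Yana rule can be sketched in the same way, this time consulting syllable weight. Again, this is a hedged illustration rather than the method used in this paper: the function names are hypothetical, and the CV shapes supplied for each syllable are my own assumed analysis, not one taken from Sapir and Swadesh (1960).

```python
# Illustrative sketch of a quantity-sensitive stress rule (Yana-style).
# Assumption: each syllable is supplied with a CV shape such as "CV",
# "CVC" or "CVV"; the shapes in the usage example are hypothetical.

HEAVY_SHAPES = {"CVV", "CVC"}  # the two heavy syllable types named in the text


def yana_stress_index(shapes):
    """Return the index of the stressed syllable.

    Stress falls on the leftmost heavy syllable; if no syllable is heavy,
    it defaults to the initial syllable (index 0).
    """
    for index, shape in enumerate(shapes):
        if shape in HEAVY_SHAPES:
            return index
    return 0


def mark_primary(syllables, index):
    """Prefix the stressed syllable with the IPA primary stress mark."""
    return "".join(
        ("ˈ" if i == index else "") + syl for i, syl in enumerate(syllables)
    )


if __name__ == "__main__":
    syllables = ["si", "bum", "k’ai"]  # assumed syllabification of 'sandstone'
    shapes = ["CV", "CVC", "CVV"]      # assumed weight analysis
    print(mark_primary(syllables, yana_stress_index(shapes)))
```

The contrast with the Maranungku sketch is that the position of stress here cannot be computed from syllable count alone; the function has to inspect each syllable's shape before it can decide.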

In this work, I focus on Bengali, which has been classified both as a quantity-insensitive system by Hayes and Lahiri (1991) and as a quantity-sensitive system by Shaw (1984). Although these accounts disagree on the stress pattern of Bengali, all studies agree that stress in Bengali is predictable (Hayes and Lahiri 1991, Shaw 1984, Das 2001).

The paper is organized as follows. In Section 2, I provide a brief overview of Bengali and the existing accounts of stress in Bengali. In Section 3, I explain the main objective behind this study. In Section 4, I discuss relevant background information about my approach, the speech corpus, and the toolkits, and give an overview of how my model is trained to detect stress cues in Bengali. Lastly, I present my results in Section 5 and end with a discussion of the results in Section 6.

This working paper is copyrighted and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.