Acoustical Comparison between /u/ and /u:/ Arabic Vowels for Non-Native Speakers

Received Jan 10, 2018; Revised Mar 19, 2018; Accepted Apr 11, 2018
The articulation of Arabic phonemes is essential for the Malay community, since the Arabic language is mandatory for performing worship. Hence, this paper presents an acoustical analysis of the Arabic vowels /u/ and /u:/ based on tokens pronounced by Malay speakers. The experimental results showed that the Malay speakers are inclined to utter these Arabic phonemes similarly to native speakers, and that the vowels /u/ and /u:/ were articulated as high-back vowels. However, the vowel /u/ was located lower than /u:/ in the vowel space. The results also showed that /u/ and /u:/ are higher than the other vowels, specifically /a/ and /a:/. In addition, the statistical analysis of the short and long dummah showed that the formant frequencies F1, F2 and F3 vary more for /u/ than for /u:/, whereas F4 and F5 vary more for /u:/.


INTRODUCTION
Modern Standard Arabic (MSA) has 36 phonemes, which are separated into three classes: six vowels, two diphthongs and 28 consonants. Besides the two diphthongs, there are three short vowels, namely /a, i, u/, and three long vowels, namely /a:, i:, u:/ [1]. However, some researchers consider the two diphthongs to be vowels, which brings the total number of vowels to eight [2]. Even then, Arabic has fewer vowels than English. Arabic and the other Semitic languages also feature two distinctive categories of sounds, namely pharyngeal and emphatic sounds [2] & [3]. According to numerous studies, the phonetic difference between the emphatic and the non-emphatic phonemes lies in their articulation. In addition, Laufer & Baer reported that the emphatic and pharyngeal consonants share a similar articulation in the pharynx [4].
Arabic dialects, in turn, have additional vowels. For instance, the Levantine dialect has two additional diphthongs, /aj, aw/, and the Egyptian dialect has several additional vowels as well [5]. Diverse dialects are spoken across the Arab world. These dialects vary widely, which makes communication challenging even among Arabic speakers of different dialects, and this is more common among illiterate speakers. Research on vowels in various Arabic dialects [6] showed that Arabic vowels pronounced by speakers of different dialects, including Saudi, Sudanese and Egyptian, have different formants for each vowel. A recognition system was developed in [7] to recognise Arabic vowels using wavelet average framing linear prediction coding, a modified version of linear prediction coding, together with a probabilistic neural network. Another Arabic vowel recognition system, based on facial electromyography (EMG) signals, was suggested in [8] for use in computer interfaces and for subjects with speech weakness. The features of Arabic vowels and the possibility of using them in clinical applications for communication disorders were investigated in [9] to help evaluate language, speech and voice disorders in young and adult subjects. More recent research, reported in [10], addressed the vowel system of Colloquial Arabic as a quantity language in comparison with other Arabic dialects, such as Syrian and Lebanese, and with other quantity languages such as Spanish.
The acoustic characteristics of Palestinian Arabic vowels were studied in [11], where six native speakers of Palestinian Arabic articulated 1368 tokens. The results showed that F1 is higher for short /i/ and short /u/ than for their long counterparts, which indicates that the high long vowels are produced with a higher tongue position. However, the short low vowel /a/ has a lower F1 than its long counterpart. As for F2, the speakers tended to articulate short /i/ with a lower value than long /i/, in contrast with short /u/. As reported in [13], the Palestinian vowels have lower F1 and F2 values than the Iraqi [12] and Tunisian [13] vowels. Further research on Palestinian Arabic vowels [14] examined the variation in vowel duration in two groups, normal speakers and speakers with Broca's aphasia, and reported that speakers with Broca's aphasia have longer vowel durations than normal speakers. Researchers have also studied the vowels of other Arabic dialects. For example, Saudi, Sudanese and Egyptian Arabic vowels were examined in [15] to investigate whether MSA vowels are realized in the same way when spoken by speakers of different dialects. The study showed that the short vowels tended to be more centralized than the long vowels. Another study [16] covered the vowels of eight Arabic dialects: Lebanese, Syrian, Qatari, Tunisian, Emirati, Jordanian, Saudi and Sudanese. Monosyllabic words were used in the experiments, and the study found significant differences in formant values among all eight dialects.
In addition, vowels in Libyan Arabic were studied in [17] to provide acoustic and auditory descriptions of them in comparison with the vowel attributes of other Arabic dialects. Monosyllabic words were recorded from 20 native speakers of Libyan Arabic. The results showed that the long and short vowels differed significantly in both quantity and quality, and that the short vowels were more centralized than reported in other studies. Another study [18] presented a formant-based analysis of spoken Arabic vowels. It considered the first two formants, together with the differences and similarities between vowels. All the carrier words followed the Consonant-Vowel-Consonant (CVC) pattern.

METHODOLOGY
A corpus of spoken Arabic phonemes is used in this study. As stated earlier, all speakers in this corpus are Malays. There are eight subjects in total, six males and two females, aged between 18 and 38 years. Each speaker is required to articulate 28 vowel syllables, which represent the six main vowels of the Arabic language. A total of 224 tokens are articulated, and each speaker is given one session to utter all the syllables. All speakers are given time to pronounce the syllables prior to the recording. The recordings are made with a SAMSON C03U USB multi-pattern condenser microphone. This microphone can record high-quality voice even in a noisy environment owing to its built-in switchable high-pass filter and 10 dB pad. All phonemes are recorded at a sampling rate of 16000 Hz with a 32-bit sample format in a single (mono) channel. Audacity 2.0.3 is used as the recording platform. The approximate locations of the Arabic vowels are illustrated in Figure 1, from [10]. As depicted in Figure 1, the long vowels are represented by /a, u, i/ and their short counterparts by /ӕ, Ʊ, I/ respectively.

EXPERIMENTAL ANALYSIS AND DISCUSSION
This section discusses in detail the acoustical analysis of the short and long dummah. Firstly, the location of each vowel in the vowel space is analysed. Next, the formant frequencies F1, F2, F3, F4 and F5 of each vowel are discussed for every phoneme. Statistical analysis is vital for evaluating and validating the results obtained, as reported in [19], [20] & [21]. Hence, in this study, statistical measurements are performed to determine the variation between the vowels.
Figure 2 depicts the distribution of the short dummah, which is categorised as a high-back vowel, for all 28 Arabic letters. As shown in Figure 2, Malay speakers also tend to pronounce this vowel as a high-back vowel, since most of the tokens are located near the minimum values of F1 and F2, as highlighted by the circle. In addition, there is more variability along F2 than along F1, but this vowel shows less variation than the short fatha.
Table 1 shows the formant frequencies F1, F2, F3, F4 and F5 for the short dummah vowels /u/. From Table 1, it is observed that F3 shows more disparity than the other formants, since most of the phonemes exceed 2000 Hz. The phoneme /ðˤu/ has the highest F2, F3 and F4 values, specifically 2443.1 Hz, 3454.1 Hz and 4137.4 Hz respectively, whilst for F5 the token /du/ has the highest frequency, 5093.7 Hz. Further, as shown in Figure 3, the formant frequencies are clearly distinct from one another, with no overlap between them. As for F1, most of the vowels are below 1000 Hz, except for the /su/ and /ʃu/ vowels.
Moreover, the statistical results tabulated in Table 2 show that F2 has the largest range, which indicates that the short dummah vowels are stretched along the F2 axis, whilst F1 has the smallest range, which indicates that these vowels are less spread along the F1 axis.
As for the mean value, F1 has the smallest, which indicates that the short dummah vowels are positioned in the high-back region of the vowel space, as shown earlier in Figure 2.
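The range and mean discussed above can be computed per formant with a few lines of Python. This is only a minimal sketch: the function name `formant_summary` is hypothetical, the standard deviation is an added assumption (the text names only the range and the mean), and the input numbers are arbitrary placeholders, not measurements from Table 1.

```python
from statistics import mean, stdev


def formant_summary(values_hz):
    """Summarise one formant (e.g. F1) over a list of token values in Hz."""
    if len(values_hz) < 2:
        raise ValueError("at least two tokens are needed")
    lowest, highest = min(values_hz), max(values_hz)
    return {
        "min": lowest,
        "max": highest,
        "range": highest - lowest,   # spread along this formant's axis
        "mean": mean(values_hz),     # central position on this axis
        "stdev": stdev(values_hz),   # sample standard deviation (assumed measure)
    }


# Placeholder values only; they are NOT taken from the paper's tables.
placeholder_f1 = [400.0, 500.0, 600.0]
summary = formant_summary(placeholder_f1)
```

Applying the same function to each of F1 to F5 would give one row per formant, which is the kind of layout a table of descriptive statistics typically uses.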

Short Dummah /u/
Comparing the short dummah vowels /u/ with the short fatha vowels /a/, the mean F1 of the short dummah vowels is smaller than that of the short fatha vowels, which indicates that the short dummah vowels are higher than the short fatha vowels in the vowel space. The mean F1 values are 520 Hz and 736 Hz for the short dummah and the short fatha respectively. Further, Figure 4 shows the difference between the short dummah vowels and the short fatha vowels. Short dummah vowels are indicated by the dotted rectangle and short fatha vowels by the dotted circle.
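The height comparison can be made explicit with the two mean F1 values quoted in this section. The sketch below assumes the usual acoustic-phonetic convention that F1 varies inversely with vowel height, so the vowel with the lower mean F1 is the higher one; the function name `higher_vowel` is hypothetical.

```python
def higher_vowel(f1_means_hz):
    """Return the label with the lowest mean F1, i.e. the higher (closer) vowel."""
    return min(f1_means_hz, key=f1_means_hz.get)


# Mean F1 values as quoted in the text (Hz).
f1_means = {"short dummah /u/": 520.0, "short fatha /a/": 736.0}

highest = higher_vowel(f1_means)
# Gap in mean F1 between the two vowels (Hz).
difference_hz = f1_means["short fatha /a/"] - f1_means["short dummah /u/"]
```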

Long Dummah /u:/
The long dummah vowel is considered a high-back vowel, as shown earlier in Figure 1. Its distribution in the vowel space differs little from that of the short dummah vowels /u/, as depicted in Figure 5 and earlier in Figure 2, apart from the fact that the short dummah vowels /u/ tend to have higher F1 and F2. Note that the F1 and F2 ranges are 786.4 Hz and 1521 Hz respectively for the short dummah vowels /u/, and 555.9 Hz and 1147.8 Hz respectively for the long dummah vowels /u:/. These values suggest that the long dummah vowels are higher and further back in the vowel space. Next, Figure 6 illustrates the difference between the short and long dummah vowels; both show wide variation along F2. Further, Figure 7 compares the long dummah vowels with the long fatha vowels. As observed in Figure 7, the major difference between these two vowels is that the long dummah vowels tend to hold a high-back position, while the long fatha vowels tend to centralise in the vowel space.
Table 3 shows the F1, F2, F3, F4 and F5 frequencies for the long dummah vowels /u:/. In Table 3, F5 is the most stable of the formants, with all tokens below 5000 Hz. This explains why the range of F5 presented in Table 4 is smaller than the range of F5 for the vowel /u/. In contrast, F3 has more tokens that exceed 2000 Hz. Further, Figure 8 shows that the formant frequencies are all distinct, except for a slight overlap between F3 and F4.
Further, Table 4 shows the statistical analysis for the 28 long dummah vowels. Compared with the results tabulated in Table 2, there are differences between the acoustic realizations of /u/ and /u:/.
The comparison shows that the short dummah vowels have larger ranges for F1 and F2, which means that they are more widely distributed in the vowel space than the long vowels. The mean values of F1 and F2 for the long dummah vowels are lower than those of the short dummah vowels, which indicates that the long dummah vowels are more high-back.
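As a rough illustration of this range comparison, the sketch below takes the F1 and F2 ranges quoted in this section and reports how much narrower the long dummah distribution is on each axis. The function name `range_reduction` and the choice to express the gap as a percentage of the short-vowel range are assumptions made for illustration only.

```python
# F1 and F2 ranges as quoted in the text (Hz).
ranges_hz = {
    "/u/": {"F1": 786.4, "F2": 1521.0},
    "/u:/": {"F1": 555.9, "F2": 1147.8},
}


def range_reduction(short, long_):
    """Per formant: (absolute gap in Hz, gap as a percentage of the short-vowel range)."""
    result = {}
    for formant, short_range in short.items():
        gap = short_range - long_[formant]
        result[formant] = (round(gap, 1), round(100.0 * gap / short_range, 1))
    return result


reduction = range_reduction(ranges_hz["/u/"], ranges_hz["/u:/"])
```

A positive gap on both axes is what the statement about the wider distribution of the short dummah vowels amounts to numerically.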

CONCLUSION
In conclusion, this study discussed the acoustical attributes of the short and long dummah vowels in the Arabic language, with all tokens articulated by Malay speakers, in order to investigate their articulation as compared to that of native Arabic speakers. The results showed that the speakers tend to articulate the short dummah /u/ as a high-back vowel, and that it is higher than the short fatha /a/. The vowels /u/ and /u:/ were similar in terms of their location in the vowel space. However, the long dummah /u:/ is still higher than /u/ in the vowel space.