Published April 2026 | Version v2
Preprint Open

Beyond AUM: The Acoustics of the Fourth Element

Authors/Creators

Description

The Vedic syllable AUM is traditionally analysed as three phonemes (A, U, M), yet Indian philosophy has long described a fourth element beyond these audible segments. This paper reports a spectrographic observation of what occurs during the sustained nasal /m/ when it is chanted with deliberate articulatory awareness.

Using VoceVista Video and Praat to analyse sustained vocalisations at four fundamental frequencies (f0 ≈ 54 Hz creaky voice, 105 Hz inert control, 108 Hz modal phonation, and 136 Hz modal phonation), I document a continuous reorganisation of the upper formant structure during the nasal /m/: F2 rises from ~700 Hz to ~2500 Hz, F3 rises from ~2500 Hz to ~3700 Hz, and the nasal anti-resonance (a travelling band of spectral silence) migrates from ~500 Hz to ~3000 Hz. The rising F2 and the rising anti-resonance converge at ~1900 Hz, where they meet a steady peak (N3, the third resonance of the pharyngo-nasal tube), producing a brief, deep attenuation before F2 recovers. The trajectory is reproduced at all three living fundamentals (54, 108, 136 Hz), and the N3 anchor remains within a narrow frequency band regardless of f0. A pitch-matched comparison between a living /m/ at 108 Hz and an inert /m/ at 105 Hz, from the same performer in the same recording session, demonstrates that the acoustic contrast between the two variants is attributable to articulatory behaviour, specifically deliberate tongue advancement, and not to pitch-related or register-related differences.
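One way to see why an f0-independent peak is informative: harmonics sit at integer multiples of f0, so a peak that stays near ~1900 Hz at 54, 105, 108 and 136 Hz has to be carried by a different harmonic number in each recording. The short Python sketch below works through that arithmetic using only the values quoted above. The helper name nearest_harmonic is hypothetical and the calculation is an illustration, not the paper's analysis procedure.

```python
# Illustrative arithmetic only. The f0 values and the ~1900 Hz anchor are the
# figures quoted in the description; nearest_harmonic is a hypothetical helper.
N3_ANCHOR_HZ = 1900.0

F0_HZ = {
    "living /m/, creaky voice": 54.0,
    "inert /m/, control": 105.0,
    "living /m/, modal": 108.0,
    "living /m/, modal (higher)": 136.0,
}


def nearest_harmonic(f0_hz, target_hz):
    """Return (harmonic number, harmonic frequency in Hz) closest to target_hz."""
    k = max(1, round(target_hz / f0_hz))
    return k, k * f0_hz


for label, f0 in F0_HZ.items():
    k, fk = nearest_harmonic(f0, N3_ANCHOR_HZ)
    # Harmonics are spaced f0 apart, so the nearest one is always within
    # f0 / 2 of the target frequency.
    print(
        f"{label:>27}: f0 = {f0:5.1f} Hz, nearest harmonic H{k} = {fk:6.1f} Hz, "
        f"offset = {fk - N3_ANCHOR_HZ:+6.1f} Hz"
    )
```

Because the harmonic number nearest 1900 Hz changes with f0, a peak tied to a fixed harmonic would shift in frequency between recordings, whereas a resonance would not. The harmonic spacing also bounds how precisely any resonance can be located from a single sustained pitch (to within roughly f0/2), which is one reason the low-f0 recording samples the spectral envelope most densely.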

I distinguish this 'living' /m/ from an 'inert' variant with stationary spectral features, and propose the living trajectory as the acoustic correlate of the graded sonic dissolution described in the nāda-yoga strand of the Indian philosophical tradition: the same continuous sonic progression named by the tradition from the inside, documented here from the outside. The acoustic contrast between living and inert /m/ aligns with the traditional distinction between 'struck' (āhata) and 'unstruck' (anāhata) sound, and with the fourth element long described in the Upaniṣadic and Yoga traditions.

Notes

Version 2 — April 2026
Substantially revised and expanded version of the April 2026 preprint.

Extended dataset and controls:

  • Added three new recording conditions: living /m/ at f0 ≈ 54 Hz (creaky voice), living /m/ at f0 ≈ 136 Hz (chest voice), and an inert /m/ at f0 ≈ 105 Hz (pitch-matched control). Version 1 reported a single condition (f0 ≈ 110 Hz, chest voice).

    The principal reference condition (f0 ≈ 110 Hz in v1) is retained in v2 as the living /m/ at f0 ≈ 108 Hz; the small difference reflects more precise measurement of the same recording.

  • The pitch-matched living/inert comparison (108 Hz vs. 105 Hz, same performer, same session) is new and constitutes the principal empirical control of this version.

  • The N3 anchor, revised from ≈1800 Hz (v1) to ≈1900 Hz, is now corroborated across all three living fundamentals.

Added analysis:

  • Praat Burg LPC formant tracking (6 formants, 5000 Hz ceiling, 25 ms window) added to cross-validate VoceVista visual estimates; manual F2 reconstruction through the pole-zero crossing region documented.

  • New §3.3 (Automatic formant tracking validation) with three annotated Praat figures (Figure 4).

  • New subsection documenting qualitative features of the trajectory (progressive ascent, slow uccāra-paced timescale, dense resonance at the crossing zone).
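For readers who want to interpret the Praat settings listed above (6 formants, 5000 Hz ceiling, 25 ms window), the following Python sketch derives what those settings imply under Praat's documented Burg behaviour: resampling to twice the formant ceiling, two LPC poles per formant, and a Gaussian window whose actual length is twice the stated value. The time step is not stated in these notes, so the figure below assumes Praat's default of 25% of the window length. The function names are hypothetical and no Praat call is made.

```python
# Settings as reported in the version notes above.
NUM_FORMANTS = 6
FORMANT_CEILING_HZ = 5000.0
WINDOW_LENGTH_S = 0.025

# f0 values of the four recording conditions, from the description.
F0_VALUES_HZ = (54.0, 105.0, 108.0, 136.0)


def burg_settings_summary(num_formants, ceiling_hz, window_s):
    """Derive the analysis parameters implied by Praat's Burg formant settings."""
    return {
        # Praat resamples the sound to twice the formant ceiling before LPC.
        "resample_rate_hz": 2.0 * ceiling_hz,
        # Each formant is modelled as a pole pair, so the order is 2 x formants.
        "lpc_order": 2 * num_formants,
        # The stated window length is the effective duration; the Gaussian
        # window actually spans twice that.
        "gaussian_window_total_s": 2.0 * window_s,
        # ASSUMPTION: time step left at Praat's default (25% of window length).
        "assumed_time_step_s": window_s / 4.0,
    }


def periods_in_window(f0_hz, window_s):
    """Number of glottal periods covered by the effective analysis window."""
    return f0_hz * window_s


summary = burg_settings_summary(NUM_FORMANTS, FORMANT_CEILING_HZ, WINDOW_LENGTH_S)
for name, value in summary.items():
    print(f"{name}: {value}")

for f0 in F0_VALUES_HZ:
    n = periods_in_window(f0, WINDOW_LENGTH_S)
    print(f"f0 = {f0:5.1f} Hz -> {n:.2f} periods per 25 ms effective window")
```

The second loop shows that the effective window covers only about 1.35 glottal periods at 54 Hz, compared with roughly 2.6 to 3.4 periods for the other conditions, so frame-to-frame formant estimates in the creaky-voice recording may be noisier than in the modal ones.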

Methodological detail:

  • Recording protocol expanded: tongue endpoint now described precisely as broad anterior/middle contact with the hard palate, approximating the myofunctional rest position.

Interpretive revision:

  • The interpretive framework is substantially strengthened. Version 1 posed the analogy as an open question ("heuristic parallel," "possible correlate"). Version 2 proposes the living trajectory explicitly as the acoustic correlate of the graded sonic dissolution in the nāda-yoga strand, distinguishing it from the Māṇḍūkya's soundless amātra.

  • §1.1 (What This Paper Does Not Claim) rewritten to separate the two strands of the tradition and clarify the scope of the claim.

  • §4.2 renamed from "The Trajectory as the Fourth Element" to "The Acoustic Correlate of the Nāda-Yoga Dissolution" and substantially revised.

  • Falsifiability framed at two distinct levels: acoustic (multi-speaker replication) and neurophysiological (simultaneous EEG/MEG protocol).

  • §4.3 expanded with a new observation about the role of inter-repetition silence in repetitive AUM chanting.

New references added:

  • Saus, Seither-Preisler & Schneider (2025) — MEG evidence for theta-dominant processing of overtone-rich stimuli.

  • Bergevin et al. (2020) — overtone focusing in Tuvan throat singing.

  • Dang, Honda & Suzuki (1994) — nasal and paranasal cavity morphology.

  • Kitamura et al. (2016) — paranasal sinus contribution to nasal tract transfer function.

  • Reznikoff (2004/2005) — phenomenological A-O-U-M ascent through the body.

  • Gerety (2015, 2021) added alongside the already-cited 2025 interview.

Structural changes:

  • Figure order revised: Figure 1 now leads with the inert/living contrast; Figures 2–3 show living /m/ at 54 Hz and 136 Hz; Figure 4 is new (Praat LPC tracking).

  • §3.1 (formerly covering the /ɑ/→/u/ vowel segment) removed; paper now opens directly with the inert/living contrast as the principal finding.

  • AI assistance disclosure and data availability statement added.

Files

Beyond_AUM.pdf (3.7 MB)
md5:8e55e1a53014717bc08003920c834df3

Additional details

Related works

Is supplement to
Preprint: 10.5281/zenodo.19115650 (DOI)
Preprint: 10.5281/zenodo.18861836 (DOI)
Preprint: 10.5281/zenodo.19438733 (DOI)