Thesis Open Access
The aim of this thesis is the transformation of the timing, duration and fundamental frequency of audio objects, such as single notes or the melody line of one instrument, within a polyphonic audio environment. The research is limited to harmonic audio objects, which consist of a series of frequency partials in a quasi-harmonic relation to one another. This excludes percussive sounds of a stochastic nature without sinusoidal content, while the stochastic component that is an intrinsic part of many instruments is taken into account because of its perceptual significance. A priori knowledge of the pitch and timing of each note is required and is taken from a MIDI file containing several monophonic tracks. Consequently, no probabilistic estimation of the number of concurrent sources or of the pitch and timing of note events is performed, and the errors such an estimation would introduce are left out of consideration. In contrast to well-known fields of research such as source separation or audio decomposition into single monophonic tracks, the research and the application implemented for this thesis estimate the harmonic partials, the transient part of each note onset and the stochastic residual belonging to one audio object without iteratively subtracting the estimation results from the input stream. Before synthesis, musically meaningful transformations are considered: time-scaling or pitch-shifting with a corresponding scaling factor, shifting audio objects in time, and substituting audio objects with other instrument types. The re-synthesized audio output is examined in subjective listening tests and objective evaluation tests against a reference output, in which the same transformation was applied to the corresponding audio object of the monophonic track before all tracks were mixed together.
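As a rough illustration of how a MIDI note number can seed the search for quasi-harmonic partials, the following Python sketch derives nominal partial frequencies from a note number and an optional pitch shift. It is not taken from the thesis: the function names, the equal-tempered tuning with A4 = 440 Hz, and the stiff-string inharmonicity term are all assumptions made for the example.

```python
def midi_to_hz(note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (assumes A4 = note 69)."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)


def partial_frequencies(note, n_partials=5, inharmonicity=0.0, shift_semitones=0.0):
    """Nominal frequencies of the first n_partials of a harmonic audio object.

    inharmonicity is a hypothetical coefficient B in the stiff-string model
    f_k = k * f0 * sqrt(1 + B * k^2); B = 0 gives exact integer multiples.
    shift_semitones applies a pitch shift to the fundamental before the
    partials are derived.
    """
    f0 = midi_to_hz(note) * 2.0 ** (shift_semitones / 12.0)
    return [k * f0 * (1.0 + inharmonicity * k * k) ** 0.5
            for k in range(1, n_partials + 1)]


if __name__ == "__main__":
    # Partials of A4, unshifted and shifted up one octave.
    print(partial_frequencies(69, n_partials=3))
    print(partial_frequencies(69, n_partials=3, shift_semitones=12.0))
```

In practice such nominal values would only act as starting points; the measured partials of a real instrument deviate from them, which is why the relation is described as quasi-harmonic.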