Conference paper Open Access

On Audio Processes in the Artificial Intelligence [self.]

Øyvind Brandtsegg; Axel Tidemann

This paper describes [self.], an open source art installation that embodies artificial intelligence (AI) in order to learn, react, and respond to stimuli from its immediate environment. Biologically inspired models are implemented to achieve this behavior, and Csound is used for most parts of the audio processing involved in the system. The artificial intelligence is physically represented by a robot head, built on a modified moving head for stage lighting. Everything but the motors of the stage lighting unit was removed and a projector, camera and microphones added. No form of knowledge or grammar have been implemented in the AI, the system starts in a ``tabula rasa state and learns everything via its own sensory channels, forming categories in a bottom-up fashion. The robot recognizes sounds and faces, and is able to recognize similar sounds, link them with the corresponding faces, and use the knowledge of past experiences to form new sentences. Since the utterances of the AI is solely based on audio and video items it has learned from the interaction with people, an insight into the learning process (i.e. what it has learned from who) can be glimpsed. This collage-like composition has guided several design choices regarding the aesthetics of the audio and video output. This paper will focus on the audio processes of the system, herein audio recording, segmentation, analysis, processing and playback.

Files (1.7 MB)
Name Size
1.7 MB Download
All versions This version
Views 5555
Downloads 3939
Data volume 65.8 MB65.8 MB
Unique views 5454
Unique downloads 3838


Cite as