Real-time acoustic source localization in noisy environments for human-robot multimodal interaction

Vlad M. Trifa, Ansgar Koene, Jan Morén, Gordon Cheng

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

35 Scopus citations

Abstract

Interaction between humans involves a plethora of sensory information, in the form of both explicit communication and more subtle, unconsciously perceived signals. In order to enable natural human-robot interaction, robots will have to acquire the skills to detect and meaningfully integrate information from multiple modalities. In this article, we focus on sound localization in the context of a multi-sensory humanoid robot that combines audio and video information to yield natural and intuitive responses to human behavior, such as directed eye-head movements towards natural stimuli. We highlight four common sound source localization algorithms and compare their performance and advantages for real-time interaction. We also briefly introduce an integrated distributed control framework called DVC, where additional modalities such as speech recognition, visual tracking, or object recognition can easily be integrated. We further describe the way the sound localization module has been integrated into our humanoid robot, CB.

Original language: English
Title of host publication: 16th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN
Pages: 393-398
Number of pages: 6
State: Published - 2007
Externally published: Yes
Event: 16th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN - Jeju, Korea, Republic of
Duration: 26 Aug 2007 to 29 Aug 2007

Publication series

Name: Proceedings - IEEE International Workshop on Robot and Human Interactive Communication

Conference

Conference: 16th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN
Country/Territory: Korea, Republic of
City: Jeju
Period: 26/08/07 to 29/08/07
