Using audio, visual, and lexical features in a multi-modal virtual meeting director

Marc Al-Hames, Benedikt Hörnler, Christoph Scheuermann, Gerhard Rigoll

Publication: Contribution to book/report/conference proceedings › Conference contribution › Peer-reviewed

3 Citations (Scopus)

Abstract

Multi-modal recordings of meetings provide the basis for meeting browsing and for remote meetings. However, it is often not useful to store or transmit all visual channels. In this work we show how a virtual meeting director selects one of seven possible video modes. We then present several audio, visual, and lexical features for such a virtual director. In the experimental section we evaluate these features, their influence on the camera selection, and the properties of the generated video stream. The chosen features all allow real- or near-real-time processing and can therefore be applied not only to offline browsing but also to a remote meeting assistant.
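To make the selection step concrete, the following is a minimal, hypothetical sketch of how a feature-driven director could pick one of seven video modes per time step. The mode names, feature set, weights, and thresholds are all illustrative assumptions for the sketch; the paper's actual features and selection model are described in the full text.

```python
# Hypothetical sketch of a feature-driven video-mode selector, loosely
# following the paper's setup (seven video modes chosen from audio,
# visual, and lexical cues). All names, weights, and the scoring rule
# below are illustrative assumptions, not the authors' actual method.

from dataclasses import dataclass

# Seven possible output modes, e.g. one close-up per participant plus
# two overview shots and a presentation view (the exact mode set is
# an assumption).
VIDEO_MODES = [
    "closeup_1", "closeup_2", "closeup_3", "closeup_4",
    "overview_left", "overview_right", "presentation",
]

@dataclass
class Features:
    """Per-time-step multi-modal features (illustrative)."""
    speech_activity: list[float]   # audio: per-participant speaking probability
    motion_energy: list[float]     # visual: per-participant motion level
    presentation_keyword: bool     # lexical: e.g. "slide", "as you can see"

def select_mode(f: Features) -> str:
    """Pick the video mode with the highest heuristic score."""
    # Lexical cue overrides: a presentation-related keyword switches
    # straight to the presentation view.
    if f.presentation_keyword:
        return "presentation"
    # Otherwise score each close-up by a weighted sum of audio and
    # visual activity; the weights are assumed, not from the paper.
    scores = [0.7 * s + 0.3 * m
              for s, m in zip(f.speech_activity, f.motion_energy)]
    best = max(range(len(scores)), key=scores.__getitem__)
    # Fall back to an overview shot when no participant is clearly active.
    if scores[best] < 0.2:
        return "overview_left"
    return f"closeup_{best + 1}"

if __name__ == "__main__":
    frame = Features(speech_activity=[0.1, 0.8, 0.0, 0.05],
                     motion_energy=[0.2, 0.4, 0.1, 0.0],
                     presentation_keyword=False)
    print(select_mode(frame))   # -> closeup_2
```

A real director would likely also smooth these per-step decisions over time to avoid rapid camera switches; the properties of the generated video stream are among the aspects evaluated in the paper.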

Original language: English
Title: Machine Learning for Multimodal Interaction - Third International Workshop, MLMI 2006, Revised Selected Papers
Pages: 63-74
Number of pages: 12
Publication status: Published - 2006
Event: 3rd International Workshop on Machine Learning for Multimodal Interaction, MLMI 2006 - Bethesda, MD, United States
Duration: 1 May 2006 – 4 May 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4299 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 3rd International Workshop on Machine Learning for Multimodal Interaction, MLMI 2006
Country/Territory: United States
City: Bethesda, MD
Period: 1/05/06 – 4/05/06
