TY - GEN
T1 - Applying Bayesian belief networks in approximate string matching for robust keyword-based retrieval
AU - Schuller, Björn
AU - Müller, Ronald
AU - Rigoll, Gerhard
AU - Lang, Manfred
PY - 2004
Y1 - 2004
N2 - This work presents a novel approach to robust keyword-based retrieval in which Bayesian Belief Networks are applied in a word-model-based Approximate String Matching algorithm. Besides the proven reliable performance of a working implementation on standard sources such as digital text, the fully probabilistic modeling allows the integration of confidence measures and hypotheses obtained from preprocessing stages such as handwriting recognition or optical character recognition, thereby respecting uncertainties at the lower levels. Furthermore, a flexible method is provided for modeling specific error types that derive from humans and from various input sources. The performance of the presented algorithms was tested in an extensive evaluation against the Levenshtein distance, which can be seen as the basis of state-of-the-art methods in this research field. The tests were run on a 14K database of common international music titles and on four 10K databases consisting of the most frequently used words in English, German, French, and Dutch.
AB - This work presents a novel approach to robust keyword-based retrieval in which Bayesian Belief Networks are applied in a word-model-based Approximate String Matching algorithm. Besides the proven reliable performance of a working implementation on standard sources such as digital text, the fully probabilistic modeling allows the integration of confidence measures and hypotheses obtained from preprocessing stages such as handwriting recognition or optical character recognition, thereby respecting uncertainties at the lower levels. Furthermore, a flexible method is provided for modeling specific error types that derive from humans and from various input sources. The performance of the presented algorithms was tested in an extensive evaluation against the Levenshtein distance, which can be seen as the basis of state-of-the-art methods in this research field. The tests were run on a 14K database of common international music titles and on four 10K databases consisting of the most frequently used words in English, German, French, and Dutch.
UR - http://www.scopus.com/inward/record.url?scp=11244305954&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:11244305954
SN - 0780386035
SN - 9780780386037
T3 - 2004 IEEE International Conference on Multimedia and Expo (ICME)
SP - 1999
EP - 2002
BT - 2004 IEEE International Conference on Multimedia and Expo (ICME)
T2 - 2004 IEEE International Conference on Multimedia and Expo (ICME)
Y2 - 27 June 2004 through 30 June 2004
ER -