TY - JOUR
T1 - Distributing recognition in computational paralinguistics
AU - Zhang, Zixing
AU - Coutinho, Eduardo
AU - Deng, Jun
AU - Schuller, Björn
N1 - Publisher Copyright: © 2010-2012 IEEE.
PY - 2014/10/1
Y1 - 2014/10/1
AB - In this paper, we propose and evaluate a distributed system for multiple Computational Paralinguistics tasks in a client-server architecture. The client side deals with feature extraction, compression, and bit-stream formatting, while the server side performs the reverse process, plus model training, and classification. The proposed architecture favors large-scale data collection and continuous model updating, personal information protection, and transmission bandwidth optimization. In order to preliminarily investigate the feasibility and reliability of the proposed system, we focus on the trade-off between transmission bandwidth and recognition accuracy. We conduct large-scale evaluations of some key functions, namely, feature compression/decompression, model training and classification, on five common paralinguistic tasks related to emotion, intoxication, pathology, age and gender. We show that, for most tasks, with compression ratios up to 40 (bandwidth savings up to 97.5 percent), the recognition accuracies are very close to the baselines. Our results encourage future exploitation of the system proposed in this paper, and demonstrate that we are not far from the creation of robust distributed multi-task paralinguistic recognition systems which can be applied to a myriad of everyday life scenarios.
KW - Computational paralinguistics
KW - distributed recognition system
KW - emotion
KW - split vector quantization
UR - http://www.scopus.com/inward/record.url?scp=84915735260&partnerID=8YFLogxK
U2 - 10.1109/TAFFC.2014.2359655
DO - 10.1109/TAFFC.2014.2359655
M3 - Article
AN - SCOPUS:84915735260
SN - 1949-3045
VL - 5
SP - 406
EP - 417
JO - IEEE Transactions on Affective Computing
JF - IEEE Transactions on Affective Computing
IS - 4
M1 - 6906228
ER -