TY - GEN
T1 - Finding non-redundant multi-word events on Twitter
AU - Günnemann, Nikou
AU - Pfeffer, Jürgen
N1 - Publisher Copyright: © 2015 ACM.
PY - 2015/8/25
Y1 - 2015/8/25
AB - Twitter is a pervasive technology, with hundreds of millions of users serving as sensors that provide eyewitness accounts of events on the ground. In the case of popular events, these sensors start to broadcast news by tweeting to their followers and to the world. Within minutes, these tweets can attract attention and also serve as a primary information source for traditional media. Given a huge set of tweets, the key questions are: (1) How can we detect informative events in general? (2) How can we distinguish relevant events from others? In this paper, we tackle these challenges with a statistical model that detects events by spotting significant deviations in word frequencies over time. Besides single-word events, our model also accounts for events composed of multiple co-occurring words, thus providing much richer information. Our statistical process is complemented by an optimization algorithm that extracts only non-redundant events, providing the user with a succinct summary of current events. We used our model to analyze 24 million geotagged tweets sent in the US from April 9 to April 22, 2013 - the time period of the Boston Marathon bombing - and we show that our approach can create multi-word events that efficiently summarize real-world events.
UR - https://www.scopus.com/pages/publications/84962532091
U2 - 10.1145/2808797.2809390
DO - 10.1145/2808797.2809390
M3 - Conference contribution
AN - SCOPUS:84962532091
T3 - Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015
SP - 520
EP - 525
BT - Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015
A2 - Pei, Jian
A2 - Tang, Jie
A2 - Silvestri, Fabrizio
PB - Association for Computing Machinery, Inc
T2 - IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015
Y2 - 25 August 2015 through 28 August 2015
ER -