Abstract
In this paper, we present a framework for partial bot rejection based on spatially supervised text mining of social media messages. We show qualitative results on the reduction of known bots and indicate how this cleaning technique can help fill gaps in current social-media-based signals related to human life on Earth. The framework uses a spatial signal to supervise a machine learning model; although this supervision carries extreme label noise, the model is still able to reject some of the unwanted components of the social media stream. Furthermore, we note that such models show significant biases and therefore cannot be used responsibly without bias analysis and mitigation for each application.
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 68-75 |
Number of pages | 8 |
Journal | GI_Forum |
Volume | 9 |
Issue number | 1 |
State | Published - 2021 |
Keywords
- Data cleaning
- Social media analysis
- Text mining