When is it biased? Assessing the representativeness of Twitter's streaming API

Fred Morstatter, Jürgen Pfeffer, Huan Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

90 Scopus citations

Abstract

Twitter shares a free 1% sample of its tweets through the "Streaming API". Recently, research has pointed to evidence of bias in this source. The methodologies proposed in previous work rely on the restrictive and expensive Firehose to find the bias in the Streaming API data. We tackle the problem of finding sample bias without costly and restrictive Firehose data. We propose a solution that focuses on using an open data source to find bias in the Streaming API.

Original language: English
Title of host publication: WWW 2014 Companion - Proceedings of the 23rd International Conference on World Wide Web
Publisher: Association for Computing Machinery, Inc
Pages: 555-556
Number of pages: 2
ISBN (Electronic): 9781450327459
DOIs
State: Published - 7 Apr 2014
Externally published: Yes
Event: 23rd International Conference on World Wide Web, WWW 2014 - Seoul, Korea, Republic of
Duration: 7 Apr 2014 to 11 Apr 2014

Publication series

Name: WWW 2014 Companion - Proceedings of the 23rd International Conference on World Wide Web

Conference

Conference: 23rd International Conference on World Wide Web, WWW 2014
Country/Territory: Korea, Republic of
City: Seoul
Period: 7/04/14 to 11/04/14

Keywords

  • Big data
  • Data sampling
  • Sampling bias
  • Twitter analysis
