Two 1%s Don't make a whole: Comparing simultaneous samples from Twitter's Streaming API

  • Carnegie Mellon University

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

43 Scopus citations

Abstract

We compare samples of tweets from the Twitter Streaming API constructed from different connections that tracked the same popular keywords at the same time. We find that on average, over 96% of the tweets seen in one sample are seen in all others. Those tweets found only in a subset of samples do not significantly differ from tweets found in all samples in terms of user popularity or tweet structure. We conclude they are likely the result of a technical artifact rather than any systematic bias. Practically, our results show that an infinite number of Streaming API samples are necessary to collect "most" of the tweets containing a popular keyword, and that findings from one sample from the Streaming API are likely to hold for all samples that could have been taken. Methodologically, our approach is extendible to other types of social media data beyond Twitter.
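The overlap statistic in the abstract (the share of tweets in one sample that also appear in all others) can be illustrated with a minimal sketch. It assumes each sample has already been reduced to a set of tweet IDs; the function names `overlap_with_all` and `mean_overlap`, the connection labels, and the toy IDs are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Set


def overlap_with_all(samples: Dict[str, Set[int]]) -> Dict[str, float]:
    """For each sample, return the fraction of its tweet IDs found in every sample."""
    if not samples:
        return {}
    # Tweet IDs present in all samples simultaneously.
    common = set.intersection(*samples.values())
    return {
        name: (len(common) / len(ids) if ids else 0.0)
        for name, ids in samples.items()
    }


def mean_overlap(samples: Dict[str, Set[int]]) -> float:
    """Average the per-sample overlap fractions across all samples."""
    fractions = overlap_with_all(samples)
    return sum(fractions.values()) / len(fractions) if fractions else 0.0


# Toy, made-up tweet IDs standing in for three simultaneous connections.
samples = {
    "conn_a": {1, 2, 3, 4},
    "conn_b": {1, 2, 3, 5},
    "conn_c": {1, 2, 3},
}

if __name__ == "__main__":
    print(overlap_with_all(samples))
    print(mean_overlap(samples))
```

Comparing sets of IDs rather than tweet text keeps the comparison exact, on the assumption that every connection reports the same ID for the same tweet.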

Original language: English
Title of host publication: Social Computing, Behavioral-Cultural Modeling, and Prediction - 7th International Conference, SBP 2014, Proceedings
Publisher: Springer Verlag
Pages: 75-83
Number of pages: 9
ISBN (Print): 9783319055787
DOIs
State: Published - 2014
Event: 7th International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction, SBP 2014 - Washington, DC, United States
Duration: Apr 1 2014 → Apr 4 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8393 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction, SBP 2014
Country/Territory: United States
City: Washington, DC
Period: 04/1/14 → 04/4/14
