Exploring patterns of identity usage in tweets

  • Carnegie Mellon University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

8 Scopus citations

Abstract

Sociologists have long been interested in the ways that identities, or labels for people, are created, used, and applied across various social contexts. The present work makes two contributions to the study of identity, in particular the study of identity in text. We first consider the following novel NLP task: given a set of text data (here, from Twitter), label each word in the text as being representative of a (possibly multi-word) identity. To address this task, we develop a comprehensive feature set that leverages several avenues of recent NLP work on Twitter and use these features to train a supervised classifier. Our model outperforms a surprisingly strong rule-based baseline by 33%. We then use our model for a case study, applying it to a large corpus of Twitter data from users who actively discussed the Eric Garner and Michael Brown cases. Among other findings, we observe that the identities used by individuals differ in interesting ways based on social context measures derived from census data.
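To make the labeling task concrete, the following is a minimal sketch of a dictionary-lookup tagger that marks each token as inside or outside an identity phrase, including multi-word phrases. It is an illustration under stated assumptions, not the authors' rule-based baseline or their supervised model: the lexicon entries and the function name `label_identities` are hypothetical, and the abstract does not specify how the baseline works.

```python
from typing import List, Set, Tuple

# Hypothetical lexicon of identity phrases (illustrative only; not from the paper).
IDENTITY_LEXICON: Set[Tuple[str, ...]] = {
    ("police", "officer"),
    ("protester",),
    ("teacher",),
}


def label_identities(
    tokens: List[str],
    lexicon: Set[Tuple[str, ...]] = IDENTITY_LEXICON,
) -> List[bool]:
    """Return one boolean per token: True if the token lies inside a
    lexicon phrase. Matching is case-insensitive and greedy, trying the
    longest candidate phrase at each position first."""
    lowered = [t.lower() for t in tokens]
    labels = [False] * len(tokens)
    max_len = max((len(p) for p in lexicon), default=0)
    i = 0
    while i < len(lowered):
        matched = 0
        # Try the longest possible phrase starting at position i.
        for n in range(min(max_len, len(lowered) - i), 0, -1):
            if tuple(lowered[i:i + n]) in lexicon:
                matched = n
                break
        if matched:
            for j in range(i, i + matched):
                labels[j] = True
            i += matched
        else:
            i += 1
    return labels


tokens = "The police officer spoke to a protester".split()
print(list(zip(tokens, label_identities(tokens))))
```

A lookup of this kind cannot tell identity uses from other senses of the same word, which is one reason a feature-based supervised classifier, as described in the abstract, might be preferred.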

Original language: English
Title of host publication: 25th International World Wide Web Conference, WWW 2016
Publisher: International World Wide Web Conferences Steering Committee
Pages: 401-412
Number of pages: 12
ISBN (Electronic): 9781450341431
State: Published - 2016
Event: 25th International World Wide Web Conference, WWW 2016 - Montreal, Canada
Duration: Apr 11 2016 to Apr 15 2016

