Abstract
Social media users are generating data on an unprecedented scale. Distributed storage systems are often used to cope with this explosive data growth. Data partitioning and replication are two interrelated data placement issues that affect the interserver traffic caused by user-initiated read and write operations in distributed storage systems. This paper investigates how to minimize the interserver traffic among a cluster of social media servers through joint optimization of data partitioning and replication. We formally define the problem and study its hardness. We then propose a traffic-optimized partitioning and replication (TOPR) method that continuously adapts data placement to various dynamics. Evaluations on real Twitter and LiveJournal social graphs show that TOPR not only significantly reduces interserver traffic but also substantially lowers the storage cost of replication compared with state-of-the-art methods. We also benchmark TOPR against the offline optimum obtained from a binary linear program.
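To illustrate why partitioning and replication interact, the following is a minimal sketch of a toy traffic model. It is an assumption for illustration only, not the formulation used in the paper: a read costs traffic when the reader's server holds neither the master copy nor a replica of the requested data, and a write costs traffic for every replica kept on another server. The function name `interserver_traffic`, the rate dictionaries, and the three-user example are all hypothetical.

```python
from typing import Dict, Set, Tuple


def interserver_traffic(
    master: Dict[str, int],
    replicas: Dict[str, Set[int]],
    reads: Dict[Tuple[str, str], float],
    writes: Dict[str, float],
) -> float:
    """Toy cost model (an assumption, not the paper's definition).

    master:   user -> server holding the master copy of the user's data
    replicas: user -> servers holding extra copies of the user's data
    reads:    (reader, owner) -> rate at which reader reads owner's data
    writes:   user -> rate at which the user's data is updated
    """
    traffic = 0.0
    # A read is local if the reader's server holds the owner's master or a replica.
    for (reader, owner), rate in reads.items():
        server = master[reader]
        local = server == master[owner] or server in replicas.get(owner, set())
        if not local:
            traffic += rate
    # Every write must be pushed to each replica stored on another server.
    for user, rate in writes.items():
        remote = replicas.get(user, set()) - {master[user]}
        traffic += rate * len(remote)
    return traffic


# Hypothetical three-user example on two servers.
master = {"a": 0, "b": 1, "c": 1}
reads = {("a", "b"): 5.0, ("b", "a"): 1.0, ("b", "c"): 3.0}
writes = {"a": 1.0, "b": 2.0, "c": 1.0}

no_replica = interserver_traffic(master, {}, reads, writes)
with_replica = interserver_traffic(master, {"b": {0}}, reads, writes)
print(no_replica, with_replica)
```

In this toy setting, replicating b's data on server 0 removes the remote reads by a but adds write traffic to keep the replica up to date, so whether a replica pays off depends on the read and write rates as well as on where the master copies are placed.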
| Original language | English |
|---|---|
| Pages (from-to) | 1008-1023 |
| Number of pages | 16 |
| Journal | IEEE Transactions on Multimedia |
| Volume | 20 |
| Issue number | 4 |
| State | Published - Apr 2018 |
Keywords
- data replication
- distributed storage
- graph partitioning
- social media
Fingerprint
Dive into the research topics of 'Traffic-Optimized Data Placement for Social Media'. Together they form a unique fingerprint.