TY - GEN
T1 - Elastic data routing in cluster-based deduplication systems
AU - Wang, Yufeng
AU - Tang, Shaojie
AU - Tan, Chiu C.
PY - 2014
Y1 - 2014
N2 - As a space-efficient approach to data archiving and backup, data deduplication is becoming increasingly popular in storage systems. However, as data grows rapidly in data centers, a single storage node can no longer provide the expected throughput and capacity. Building deduplication clusters is considered a promising strategy to relieve this bottleneck of single-node systems. However, deduplication effectiveness depends on how much the system knows about previously stored data. A single-node system holds all such information and can detect duplicate data locally, whereas storage nodes in a cluster-based system do not know what is stored on other nodes. It is nontrivial to route data intelligently enough that the system achieves deduplication performance comparable to that of a single-node system at low cost. In this paper, we propose an elastic data routing strategy that aims to achieve deduplication performance comparable to the state of the art while requiring far fewer computational resources.
AB - As a space-efficient approach to data archiving and backup, data deduplication is becoming increasingly popular in storage systems. However, as data grows rapidly in data centers, a single storage node can no longer provide the expected throughput and capacity. Building deduplication clusters is considered a promising strategy to relieve this bottleneck of single-node systems. However, deduplication effectiveness depends on how much the system knows about previously stored data. A single-node system holds all such information and can detect duplicate data locally, whereas storage nodes in a cluster-based system do not know what is stored on other nodes. It is nontrivial to route data intelligently enough that the system achieves deduplication performance comparable to that of a single-node system at low cost. In this paper, we propose an elastic data routing strategy that aims to achieve deduplication performance comparable to the state of the art while requiring far fewer computational resources.
UR - https://www.scopus.com/pages/publications/84904460247
U2 - 10.1109/INFCOMW.2014.6849183
DO - 10.1109/INFCOMW.2014.6849183
M3 - Conference contribution
AN - SCOPUS:84904460247
SN - 9781479930883
T3 - Proceedings - IEEE INFOCOM
SP - 117
EP - 118
BT - 2014 IEEE Conference on Computer Communications Workshops, INFOCOM WKSHPS 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 IEEE Conference on Computer Communications Workshops, INFOCOM WKSHPS 2014
Y2 - 27 April 2014 through 2 May 2014
ER -