TY - GEN
T1 - A client-centric grid knowledgebase
AU - Kola, George
AU - Kosar, Tevfik
AU - Livny, Miron
PY - 2004
Y1 - 2004
AB - Grid computing brings with it additional complexities and unexpected failures. Just keeping track of our jobs as they traverse different grid resources before completion can at times become tricky. In this paper, we introduce a client-centric grid knowledgebase that keeps track of job performance and failure characteristics on different grid resources as observed by the client. We present the design and implementation of our prototype grid knowledgebase and evaluate its effectiveness on two real-life grid data processing pipelines: the NCSA image processing pipeline and the WCER video processing pipeline. The knowledgebase enabled us to easily extract useful job and resource information and interpret it to make better scheduling decisions. Using it, we were able to understand failures better and to devise innovative methods to automatically avoid and recover from failures and to adapt dynamically to the grid environment, improving fault tolerance and performance.
UR - https://www.scopus.com/pages/publications/20444460754
U2 - 10.1109/CLUSTR.2004.1392642
DO - 10.1109/CLUSTR.2004.1392642
M3 - Conference contribution
AN - SCOPUS:20444460754
SN - 0780386949
T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC
SP - 431
EP - 438
BT - 2004 IEEE International Conference on Cluster Computing, ICCC 2004
T2 - 2004 IEEE International Conference on Cluster Computing, ICCC 2004
Y2 - 20 September 2004 through 23 September 2004
ER -