International Journal of Scientific & Technology Research


IJSTR >> Volume 2- Issue 9, September 2013 Edition


Website: http://www.ijstr.org

ISSN 2277-8616

Grid Service Reliability Modeling Considering Fault Recovery




Pallavi Rahinj, S. M. Sabale



Index Terms: Grid service reliability, Fault recovery, Task scheduling, Resource management system, Star topology, Local node fault recovery, Remote fault recovery.



Abstract: Grid computing is a recently developed technology. Although the development tools and techniques for the grid have been studied extensively, some important issues, e.g., grid service reliability and task scheduling in the grid, have not been sufficiently studied. For grid services with large subtasks that require time-consuming computation, the reliability of the grid service can be rather low. To resolve this problem, this paper introduces the Local Node Fault Recovery (LNFR) mechanism into grid systems and presents an in-depth study of grid service reliability modeling and analysis with this kind of fault recovery. To make the LNFR mechanism practical, some constraints, i.e., the lifetimes of subtasks and the number of recoveries performed in grid nodes, are introduced, and grid service reliability models under these practical constraints are developed. The paper also uses a new algorithm, based on the min-min algorithm, for task scheduling.
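
For readers unfamiliar with the min-min heuristic mentioned above, the following is a minimal sketch of the classic min-min algorithm only. It is not the modified algorithm developed in the paper, whose details are not given in this abstract. The function name `min_min_schedule`, the matrix `etc` (estimated execution time of each subtask on each grid node), and the sample numbers are assumptions made for illustration.

```python
def min_min_schedule(etc):
    """Classic min-min scheduling (illustrative sketch, not the paper's variant).

    etc[t][m] is the assumed estimated execution time of subtask t on node m.
    Returns (assignment, makespan), where assignment maps subtask -> node.
    """
    if not etc:
        return {}, 0.0

    num_nodes = len(etc[0])
    ready = [0.0] * num_nodes          # time at which each node becomes free
    unscheduled = set(range(len(etc)))
    assignment = {}

    while unscheduled:
        # Find the (subtask, node) pair with the smallest completion time.
        # This equals picking, among each subtask's best node, the subtask
        # whose best completion time is smallest.
        best = None                    # (completion_time, subtask, node)
        for t in sorted(unscheduled):
            for m in range(num_nodes):
                candidate = (ready[m] + etc[t][m], t, m)
                if best is None or candidate < best:
                    best = candidate

        finish, t, m = best
        assignment[t] = m              # schedule the chosen subtask
        ready[m] = finish              # the node is busy until it finishes
        unscheduled.remove(t)

    return assignment, max(ready)


# Hypothetical example: three subtasks, two nodes.
etc = [
    [4, 6],   # subtask 0
    [3, 5],   # subtask 1
    [8, 2],   # subtask 2
]
assignment, makespan = min_min_schedule(etc)
```

In this sketch the shortest subtasks are placed first on the nodes that finish them earliest, which tends to keep the overall completion time low but takes no account of node failures or recovery; combining scheduling with fault recovery is the subject of the paper itself.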


