International Journal of Scientific & Technology Research


Volume 9, Issue 4, April 2020 Edition

Website: http://www.ijstr.org

ISSN 2277-8616

Survey Of Technologies, Tools, Concepts And Issues In Big Data


Chitrakant Banchhor, N. Srinivasu



Big data, Hadoop, MapReduce, data-intensive computing, high-performance computing.



This paper surveys the tools and techniques used in Big Data and Big Data computing. It explores the principles on which Big Data and data-intensive computation rest, drawing on outlines provided by various researchers in the field. The paper starts with traditional cluster computing and then examines the Hadoop system as a tool for solving Big Data problems. It also covers the significant growth of cloud computing toward hosting MapReduce for data-intensive computation as one of the services available through clouds, and briefly discusses how the Microsoft Azure and Amazon clouds provide MapReduce services over the Internet.


