IJSTR >> Volume 9 - Issue 1, January 2020 Edition

Analytics For Healthcare Using Hadoop Mapreduce, Apache Spark And In Cloud Services

Dr. K. Sharmila, Dr. T. Kamalakannan



AWS, Big data, Cloud computing, Diabetes Mellitus, Hadoop MapReduce, K-means, SVM algorithm, Spark.



Decision making and knowledge discovery from voluminous big data is a challenging problem. Extracting useful information from an enormous amount of data is complex, difficult and time consuming. Standard data mining algorithms are therefore essential for analysing big data on different platforms. This investigation focuses on benchmarking parallel processing platforms and the Cloud computing environment. Cloud computing has emerged as a service-oriented computing model that delivers infrastructure, platforms and applications as services from providers to consumers. This study used the services provided by Amazon Web Services as an effective means of managing large-scale data processing, with elastically scalable computing and storage. The paper also discusses a MapReduce framework integrated with the K-means and SVM machine learning algorithms, run in a standalone environment and on Spark, to predict diabetes-related diseases from a real-time data set collected in various districts of Tamil Nadu. Ultimately, the present study established that parallelization using Apache Hadoop with Spark performs better than a standalone model on a single machine. With the expansion of information and communication technology, the health care industry is also producing extensively large volumes of data day by day. In developing countries like India, the accumulated data is large and poses various problems. This type of Big Data analysis will hopefully help diabetes patients and physicians to predict the disease and to treat it at an early stage.
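
To illustrate what integrating K-means with MapReduce can look like, the following is a minimal sketch and not the authors' implementation. It assumes plain in-memory Python rather than Hadoop or Spark, and the function names (`map_phase`, `reduce_phase`, `kmeans`) and the sample points are hypothetical. The map step assigns each record to its nearest centroid; the reduce step averages the records gathered under each centroid.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, ...]


def nearest(point: Point, centroids: List[Point]) -> int:
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(points: List[Point], centroids: List[Point]) -> List[Tuple[int, Point]]:
    """Map: emit one (centroid_index, point) pair per input record."""
    return [(nearest(p, centroids), p) for p in points]


def reduce_phase(pairs: List[Tuple[int, Point]], centroids: List[Point]) -> List[Point]:
    """Reduce: average the points grouped under each centroid index."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = defaultdict(int)
    for idx, point in pairs:
        if idx in sums:
            sums[idx] = [s + x for s, x in zip(sums[idx], point)]
        else:
            sums[idx] = list(point)
        counts[idx] += 1
    # A centroid that received no points keeps its previous position.
    return [
        tuple(s / counts[i] for s in sums[i]) if counts[i] else c
        for i, c in enumerate(centroids)
    ]


def kmeans(points: List[Point], centroids: List[Point], max_iter: int = 20) -> List[Point]:
    """Repeat map and reduce until the centroids stop changing."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


if __name__ == "__main__":
    # Made-up two-feature records, for illustration only.
    sample = [(1.0, 1.0), (1.0, 3.0), (9.0, 9.0), (9.0, 11.0)]
    print(kmeans(sample, [(0.0, 0.0), (10.0, 10.0)]))
```

On a cluster, the map step would run in parallel over partitions of the data set and the framework would group the emitted pairs by centroid index before the reduce step; here both steps run sequentially in one process.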



