International Journal of Scientific & Technology Research



IJSTR >> Volume 8 - Issue 7, July 2019 Edition


Website: http://www.ijstr.org

ISSN 2277-8616

Analysing Huge Data Collection And Comparing Through Algorithms: KNN, Naive And Collaborative Filtering & Hybrid




Pooja Mudgil, Shivani Gautam, Uditta Chhabra, Mansi Jadaun, Paras Jain, Vikas Singh



Keywords: Recommendation system, Naïve Bayes, K-Nearest Neighbour (KNN), Collaborative filtering, Java, Hadoop, Hive.



Recommendation systems are used to collect and analyse the large datasets of business organisations and industries, helping them identify the options that best improve throughput, efficiency and performance. The technology draws on the strengths of various other data-analysis technologies. Organisations benefit when, with the right set of tools, they can recommend suitable products to different users. A well-chosen recommendation benefits both the company and the customer and, applied at scale, can increase product sales, raise profit margins and improve customer satisfaction. This paper examines the effectiveness of recommendation systems and which algorithm is best suited to a given dataset, combining results from prior research with results obtained by analysing datasets from Kaggle using three algorithms: Naïve Bayes, KNN and collaborative filtering. For any business, production and growth are directly related to users' usage and requirements. That relationship pays off only when a user can obtain the relevant products at the time they are needed, and the process becomes fast and efficient when the recommendation system's results reinforce the user's choices and preferences. The paper therefore studies the patterns found in prior research and in the datasets, implements the algorithms, and compares them to arrive at an accurate solution for recommendation systems.


