International Journal of Scientific & Technology Research



IJSTR >> Volume 8 - Issue 11, November 2019 Edition


Website: http://www.ijstr.org

ISSN 2277-8616

An Efficient Feature Extraction Method For Mining Social Media




V Mageshwari, Dr. I. Laurence Aroquiaraj



Classification, BOW, feature extraction, HIV/AIDS, pre-processing, TF-IDF, Twitter



Social media lets users exchange opinions, thoughts and ideas. One advantage of sharing information through social media is that content spreads quickly. Twitter is one of many social media platforms, and it lets users communicate information briefly. Many real-world issues are discussed on Twitter, and HIV/AIDS ranks among the most discussed topics. With the growth of social media, many users have come forward to discuss this societal topic. Such discussions can help communication campaigns promote better HIV/AIDS education. In this work, tweets were collected using keywords including HIV and AIDS. Feature extraction was carried out after the pre-processing steps. Feature extraction is a crucial step in mining Twitter because the data is unstructured, so improving the efficiency of feature extraction improves the outcome of the classification task. This work proposes an efficient feature extraction method that gives better results than existing methods.
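
The abstract does not describe the proposed method itself, so the following is only a minimal sketch of the baseline pipeline that the keywords name: pre-processing followed by bag-of-words (BOW) and tf-idf feature extraction. The example tweets, the function names and the cleaning rules are all assumptions made for illustration; they are not the authors' data or implementation.

```python
import math
import re
from collections import Counter

# Hypothetical example tweets; not data from the study.
TWEETS = [
    "Get tested for HIV today! https://example.org #HIV",
    "@user HIV education campaigns save lives",
    "AIDS awareness and HIV testing matter",
]


def preprocess(tweet):
    """Lowercase, strip URLs and @mentions, and split into alphabetic tokens."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    return re.findall(r"[a-z]+", text)         # keeps hashtag words, drops '#'


def bag_of_words(tokens):
    """Return raw term counts for one tweet (the BOW representation)."""
    return Counter(tokens)


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized tweet.

    Assumed weighting: tf = count / tweet length, idf = ln(N / df).
    Other tf-idf variants (smoothing, normalization) are equally common.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per tweet
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [preprocess(t) for t in TWEETS]
bows = [bag_of_words(d) for d in docs]
weights = tf_idf(docs)
```

Under this weighting, a term that occurs in every tweet (here "hiv") receives a weight of zero, while a term confined to one tweet (here "aids") receives a positive weight. This is one reason tf-idf is often preferred over raw BOW counts when the collection keywords appear in nearly every document.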


