International Journal of Scientific & Technology Research

Scopus coverage:
Nov 2018 to May 2020


IJSTR >> Volume 9 - Issue 3, March 2020 Edition


Website: http://www.ijstr.org

ISSN 2277-8616

Finding Similar Content Posts Using Semantic Textual Similarity Based On Text Segmentation Through Natural Language Processing




Rohan C. Tadvi, Vrushali A. Chakkarwar



congruence, relatedness, clustering, classification, semantic, corpuses, forum, preprocessing.



Forum posts are scattered across a database, which makes determining the congruence (similarity) among text posts in web forums a cumbersome task. Congruence is a relevant property in text clustering and text classification. Traditionally, documents were searched by matching keywords or sets of terms drawn from the posts. In the proposed system, each post is treated as a corpus of words in which every term carries an individual weight, and the same terms may also occur in other corpora. For posts to serve a common goal, there should be some relatedness among the corpora of different posts, in the same or different forums, that reflects the similar intent the users meant to convey. Congruence is calculated by assigning a score to the common terms identified during preprocessing. The semantic relatedness score varies from corpus to corpus, depending on how related the corpora are. Posts are divided into segments at particular points. Corpora are built from these segments, and text features are extracted and compared by identifying the congruence of keywords. The extracted common terms are evaluated using a combination of different semantic textual similarity algorithms. After the similarity is calculated, the most similar posts are displayed to the user on a threshold basis.
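The pipeline described above (segment the posts, weight the terms, score the relatedness, filter by a threshold) can be illustrated with a minimal sketch. The abstract does not name the segmentation rule, the weighting scheme, or the similarity algorithms, so everything specific here is an assumption made for illustration: sentence-level segmentation, smoothed TF-IDF weights, cosine similarity, a best-matching-segment score per post pair, and the threshold value. The function names are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def segment(post):
    """Split a post into segments. Assumption: one segment per sentence."""
    return [s.strip() for s in re.split(r"[.!?]+", post) if s.strip()]


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens (simple preprocessing)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(token_lists):
    """Weight each term by term frequency times a smoothed inverse document frequency."""
    n = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_posts(posts, query_index, threshold=0.3):
    """Return (post_index, score) pairs whose score meets the threshold.

    Assumption: the score of a post pair is the highest cosine similarity
    between any segment of the query post and any segment of the other post.
    """
    # Segment every post and remember which post each segment came from.
    owners, token_lists = [], []
    for i, post in enumerate(posts):
        for seg in segment(post):
            owners.append(i)
            token_lists.append(tokenize(seg))

    vectors = tfidf_vectors(token_lists)
    query_vectors = [v for o, v in zip(owners, vectors) if o == query_index]

    best = {}
    for owner, vec in zip(owners, vectors):
        if owner == query_index:
            continue
        score = max((cosine(q, vec) for q in query_vectors), default=0.0)
        best[owner] = max(best.get(owner, 0.0), score)

    hits = [(i, s) for i, s in best.items() if s >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    forum = [
        "My laptop will not boot. The screen stays black.",
        "Laptop screen stays black after update. Any fix?",
        "Best pasta recipe with tomato sauce.",
    ]
    for index, score in similar_posts(forum, 0, threshold=0.3):
        print(index, round(score, 3))
```

Scoring a post pair by its best-matching segments is only one possible design choice; averaging over segments or combining several similarity measures, as the abstract suggests, would fit the same structure.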


