International Journal of Scientific & Technology Research



IJSTR >> Volume 9 - Issue 8, August 2020 Edition


Website: http://www.ijstr.org

ISSN 2277-8616

Sundanese Language Level Detection Using Rule-Based Classification: Case Studies On Twitter




Ade Sutedi, Dede Kurniadi, Wiyoga Baswardono



Classification, Language levels, Sundanese, Twitter.



Throughout the history of Sundanese tradition, language has played an important role in demonstrating the existence of Sundanese culture, especially in Banten and West Java. Today, use of the Sundanese language is declining because regional languages compete with the national language and even with foreign languages. In addition, divergence in language use within society creates disparities between younger and older people. The number of native speakers is falling as society becomes increasingly open. The issue has drawn attention over the last decade because of language death, especially among regional languages. To examine the presence of Sundanese on social media today, this research uses Twitter as an indicator of how people use the language. The objectives of this research are: (1) to identify the presence of the Sundanese language on social media; and (2) to classify the levels of the Sundanese words used and compare those levels to summarize the characteristics of Sundanese in each region. Classification draws on a Sundanese vocabulary divided into three levels: Ribaldry level (Loma), Standard level (Hormat ka sorangan), and Polite level (Hormat ka batur). It uses word n-gram features (unigram, bigram, and trigram) with rule-based classification to separate Sundanese from non-Sundanese text and to determine the language level. The data was retrieved from Twitter users by region, particularly in Banten and West Java provinces. The results show that Sundanese is still used among the people and on social media, with the ribaldry level dominating. The prediction scores for several features are lower than in previous research. However, the experiments obtained a precision of 0.841, which indicates how close the predicted positives are to the actual positives.
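
As an illustration only, the following Python sketch shows how word n-grams could be matched against a level-tagged vocabulary with a simple majority rule. The abstract does not give the actual rules or vocabulary, so the tiny lexicon, the function names (word_ngrams, classify, precision), and the majority-vote rule are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy lexicon mapping word n-grams to a language level.
# The vocabulary used in the paper is not listed in the abstract.
LEXICON = {
    ("dahar",): "loma",
    ("neda",): "hormat ka sorangan",
    ("tuang",): "hormat ka batur",
    ("balik",): "loma",
    ("wangsul",): "hormat ka sorangan",
    ("mulih",): "hormat ka batur",
}


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams of length n as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def classify(text, max_n=3):
    """Assumed rule: count lexicon hits over unigrams, bigrams, and trigrams.

    Text with no hits is labeled non-Sundanese; otherwise the level with
    the most hits wins. Tokenization is a plain whitespace split, so
    punctuation is not handled.
    """
    tokens = text.lower().split()
    hits = Counter()
    for n in range(1, max_n + 1):
        for gram in word_ngrams(tokens, n):
            level = LEXICON.get(gram)
            if level is not None:
                hits[level] += 1
    if not hits:
        return "non-Sundanese"
    return hits.most_common(1)[0][0]


def precision(true_positives, false_positives):
    """Precision = TP / (TP + FP): the share of predicted positives that are correct."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0
```

In this sketch a tweet containing only unknown words falls through to the non-Sundanese label, and precision is computed from counts of true and false positives, which is the standard definition behind a score such as the 0.841 reported above.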


