A Novel Approach For Generating Rules For SMS Spam Filtering Using Rough Sets
Ashima Wadhawan, Neerja Negi
Index Terms: Bayesian Filtering, Classification, Checksum Filter, Content Based Filtering, Heuristic Filtering, Rough Set, SMS Spam Filtering
Abstract: Spam is defined as unwanted commercial messages sent to many recipients. Email spamming is a universal problem with which everyone is familiar. The problem has now spread to mobile networks to a great extent, where it is referred to as SMS spamming. A number of approaches are used for SMS spam filtering, such as blacklist/whitelist filters, content-based filters, Bayesian filtering, checksum filters, and heuristic filters. The most common technique is content-based spam filtering, which uses the actual text of a message to determine whether it is spam. The Bayesian method models the changing nature of messages using probability theory, and a Bayesian classifier can be trained very efficiently with supervised learning. We have used a new mathematical approach, Rough Set Theory (RST), a methodology used to cluster the objects of a decision system with a large data set. In this dissertation, both the Naïve Bayes and the RST methods are implemented.
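As an illustration of how Rough Set Theory groups the objects of a decision system, the following minimal Python sketch builds indiscernibility classes over a toy table of messages and computes the lower and upper approximations of the spam class. The data, the attribute names (`has_url`, `has_prize_word`), and the function names are hypothetical and chosen only for illustration; this is not the implementation used in this work.

```python
from collections import defaultdict

# Hypothetical toy decision table: (condition attributes, decision label).
# The attributes has_url and has_prize_word are illustrative assumptions.
MESSAGES = [
    ({"has_url": 1, "has_prize_word": 1}, "spam"),
    ({"has_url": 1, "has_prize_word": 1}, "spam"),
    ({"has_url": 1, "has_prize_word": 0}, "spam"),
    ({"has_url": 1, "has_prize_word": 0}, "ham"),
    ({"has_url": 0, "has_prize_word": 0}, "ham"),
]
ATTRIBUTES = ["has_url", "has_prize_word"]


def indiscernibility_classes(rows, attributes):
    """Group row indices that share identical values on the given attributes."""
    classes = defaultdict(list)
    for index, (conditions, _decision) in enumerate(rows):
        key = tuple(conditions[a] for a in attributes)
        classes[key].append(index)
    return dict(classes)


def approximations(rows, attributes, target):
    """Return the (lower, upper) approximations of the rows labelled target."""
    target_set = {i for i, (_c, d) in enumerate(rows) if d == target}
    lower, upper = set(), set()
    for members in indiscernibility_classes(rows, attributes).values():
        block = set(members)
        if block <= target_set:   # every message in the block is spam
            lower |= block
        if block & target_set:    # at least one message in the block is spam
            upper |= block
    return lower, upper


lower, upper = approximations(MESSAGES, ATTRIBUTES, "spam")
print("certainly spam:", sorted(lower))  # [0, 1]
print("possibly spam:", sorted(upper))   # [0, 1, 2, 3]
```

Messages in the lower approximation support certain rules (here, a URL together with a prize word implies spam), while messages that are only in the upper approximation share their attribute values with at least one non-spam message and therefore support only possible rules.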