International Journal of Scientific & Technology Research


IJSTR >> Volume 5 - Issue 12, December 2016 Edition


Website: http://www.ijstr.org

ISSN 2277-8616

A Methodology In Processing Descriptive Analytics Using MMDA Traffic Update Tweets, Tokenization And Classification Tree In Discovering Knowledge




Tristan Jay P. Calaguas, Menchita F. Dumlao



tokenization, classification, tweets, traffic, methodology, knowledge discovery, update traffic



Traffic in the National Capital Region (NCR) of the Philippines is one of the many problems facing the local government and the Filipino citizens who reside in Metro Manila. A Filipino citizen working in Metro Manila wastes twenty-eight thousand hours in traffic, resulting in lost productivity. Long commutes caused by traffic also keep individuals from exercising, which leads to fatigue. Relatedly, amid the lack of exercise caused by traffic, one hundred seventy thousand Filipinos die each year from cardiovascular diseases, up from eighty-five thousand more than twenty years ago, according to a 2009 study by the Department of Health (DOH). Population growth is one of the many causes of traffic in Metro Manila. As the population grows, the volume of car riders and commuters on the road increases, along with delivery trucks, pedicabs, jeeps, and provincial buses; this signals a high employment rate in the country, which in turn contributes to traffic. To serve the public, the Metropolitan Manila Development Authority (MMDA) is the government agency that provides Filipino citizens with updated traffic information. In past years, the MMDA disseminated traffic information through telephone lines and television broadcasting, which were very costly to maintain; this led the agency to adopt Twitter for posting traffic updates and advisories to the public. Because the agency disseminates information through tweets, a methodology is needed for analyzing these tweets so that citizens can gain insight for deciding how to avoid the specific times of traffic in Metro Manila. Accordingly, the researchers adopt MMDA tweets as the primary data source and apply CRISP-DM as the standard knowledge discovery process for building a methodology for descriptive analytics.
In this experimental research, several processes were used to convert the semi-structured MMDA tweets into a structured data matrix. SQL was used for storing, retrieving, and pattern matching, while PHP string functions were used to tokenize each tweet and transform it into an array so that the tokens could be stored in a database using an iterative structure. After loading all tokens into their respective tables, we obtained a data matrix comprising time, routed roads, traffic status, and day information, which was used in data mining to discover knowledge. Lastly, we used the J48 classification algorithm to classify the times at which traffic usually occurs on many routed roads in the NCR. As a result, we discovered that commuters experience traffic from 8:00 to 9:41 in the morning and from 1:00 in the afternoon to 8:00 in the evening on C5 North Bound to South Bound and EDSA North Bound to South Bound every Tuesday and Friday, with an accuracy of 75.72%.
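The tokenization step described above can be illustrated with a minimal sketch. It is written in Python rather than the PHP string functions and SQL tables the authors report using, and it is not their implementation. The sample tweet wording, the vocabularies `ROADS`, `DIRECTIONS`, and `STATUSES`, and the function names `tokenize` and `to_row` are all hypothetical, chosen only to show how one semi-structured tweet might become one row of a data matrix with day, time, road, direction, and traffic status.

```python
import re
from datetime import date

# Hypothetical vocabularies; the paper does not list its actual lookup values.
ROADS = ("EDSA", "C5")
DIRECTIONS = {"NB": "North Bound", "SB": "South Bound"}
STATUSES = ("LIGHT", "MODERATE", "HEAVY")

# Matches a 12-hour clock time such as "8:41 AM" in the raw tweet text.
TIME_PATTERN = re.compile(r"\b(\d{1,2}):(\d{2})\s*(AM|PM)\b", re.IGNORECASE)


def tokenize(tweet):
    """Split a tweet into upper-case alphanumeric tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9]+", tweet.upper())


def to_row(tweet, posted_on):
    """Turn one tweet into one row of the data matrix (a plain dict).

    Fields that cannot be found in the tweet are set to None.
    """
    tokens = tokenize(tweet)

    # Loop over the tokens and keep the first one found in each vocabulary.
    road = next((t for t in tokens if t in ROADS), None)
    direction = next((DIRECTIONS[t] for t in tokens if t in DIRECTIONS), None)
    status = next((t for t in tokens if t in STATUSES), None)

    # The clock time is taken from the raw text, because tokenizing
    # would split "8:41 AM" into three separate tokens.
    time_24h = None
    match = TIME_PATTERN.search(tweet)
    if match:
        hour = int(match.group(1)) % 12
        if match.group(3).upper() == "PM":
            hour += 12
        time_24h = f"{hour:02d}:{int(match.group(2)):02d}"

    return {
        "day": posted_on.strftime("%A"),
        "time": time_24h,
        "road": road,
        "direction": direction,
        "status": status,
    }


# Hypothetical tweet text and posting date, for illustration only.
sample = "MMDA TRAFFIC UPDATE: EDSA Ortigas SB heavy as of 8:41 AM"
print(to_row(sample, date(2016, 11, 15)))
```

Rows in this shape could then be exported to a classifier such as J48; the time is normalized to 24-hour form here only so that morning and evening readings sort and compare consistently.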



[1] C.S. George, “Economic Effects of Traffic in Metro Manila” http://www.businessmirror.com.ph/economic-effects-of-traffic-in-metro-manila/. 2015.

[2] “Stress, pollution, fatigue: How traffic jams affect your health” http://www.apastyle.org/learn/faqs/web-page-no-author.aspx. 2015.

[3] K. George, “What causes Fatigue? 251 Causes” http://www.healthline.com/symptom/fatigue.

[4] J.A. Anne, “Cardiovascular Disease is still the country’s top killer” http://lifestyle.inquirer.net/178609/cardiovascular-disease-is-still-the-countrys-top-killer/. 2014

[5] M. Izabel, “What causes traffic jams in Metro Manila?” http://www.filipinoscribe.com/2015/09/05/what-causes-the-extreme-traffic-jams-in-metro-manila/. 2015

[6] “MMDA uses Twitter for Public Service” http://www.philstar.com/networks/711633/mmda-uses-twitter-public-service. 2011

[7] R. Wirth and J. Hipp, “CRISP-DM: Towards a Standard Process Model for Data Mining” http://citeseerx.ist.psu.edu/viewdoc/download?doi=

[8] S. Robert, “Data is Useless without Meaning: The importance of insight” http://www.econtentmag.com/Articles/News/News-Feature/Data-Is-Useless-Without-Meaning-The-Importance-of-Insight-91693.htm. 2013.

[9] A. Iftikhar, K. Shah, R. Azhar, and Z. Qamruz, “Conducting Surveys and Data Collection: From Traditional Mobile and SMS-based Surveys” Pak.j.stat.oper.res. Vol. X No. 2, 2014, pp. 169-187, available at http://search.proquest.com/openview/920ad7e2cba14988f62f0ec260989f97/1?pq-origsite=gscholar. 2014

[10] A.J. Stephen, “Business Impact of Web 2.0 Technologies” Communications of the ACM, Vol. 53 No. 12, pp. 67-79, available at http://cacm.acm.org/magazines/2010/12/102142-business-impact-of-web-2-0-technologies/fulltext. 2010

[11] B. Lorenz, “The Most Important Employee Rights at the Workplace” https://www.salarium.com/important-employee-rights/. 2016

[12] G. Paul, “What Exactly Is ‘Twitter’? What is ‘Tweeting’?” https://www.lifewire.com/what-exactly-is-twitter-2483331. 2016.

[13] B. Kristian, “Understanding Sentiment Analysis: What it Is and Why It’s Used” https://www.brandwatch.com/blog/understanding-sentiment-analysis/. 2015