Feature Selection Technique for improving classification performance in the web-phishing detection process

Authors

  • Anggit Ferdita Nugraha, University of Amikom Yogyakarta
  • Dwiky Alfian Tama, University of Amikom Yogyakarta
  • Dewi Anisa Istiqomah, University of Amikom Yogyakarta
  • Surya Tri Atmaja Ramadhani, University of Amikom Yogyakarta
  • Bayu Nadya Kusuma, University of Amikom Yogyakarta
  • Vikky Aprelia Windarni, University of Amikom Yogyakarta

DOI:

https://doi.org/10.34306/conferenceseries.v4i1.667

Keywords:

Web-phishing, Feature Selection, Pearson correlation, Classification

Abstract

Web phishing is a type of cybercrime that can threaten the online activities of website visitors. It uses a fake web page that closely mimics a legitimate website in order to fool its target into providing crucial information. Web phishing attacks also continue to grow in number year after year. As a result, it is vital to design a web phishing detection system in order to reduce the number of victims and the financial losses caused by these attacks. The development of web phishing detection systems continues to this day, with machine learning being the most commonly used approach. Unfortunately, machine learning-based web phishing detection systems are frequently built with only a single classification step, even though a feature selection step can improve the performance of the resulting classifier. Thus, in this paper an experiment was conducted that applies a feature selection procedure based on Pearson correlation before machine learning modelling with popular algorithms such as Naive Bayes, Decision Tree, and Random Forest. Using a web phishing dataset from the UCI Machine Learning Repository, it was found that adding the feature selection step increased accuracy for the Decision Tree and Random Forest algorithms, up to 94.60 percent and 95.50 percent, respectively, and led to a slight decrease in accuracy of 0.4 percent for the Naive Bayes algorithm.
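
The feature selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, the toy rows, and the 0.3 cut-off are all hypothetical, and the abstract does not state which threshold or ranking rule was used. The sketch only shows the general idea of a Pearson correlation filter, which scores each feature by its correlation with the class label and keeps the features that score highest.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no linear relationship with anything.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, labels, names, threshold=0.3):
    """Keep features whose absolute correlation with the label is at least
    `threshold`. The default of 0.3 is an assumption for illustration."""
    selected = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        r = pearson(column, labels)
        if abs(r) >= threshold:
            selected.append((name, r))
    return selected


# Hypothetical toy data, not taken from the UCI dataset.
# Labels: 1 = legitimate, -1 = phishing (an assumed encoding).
names = ["has_ip_address", "url_length", "constant_flag"]
rows = [
    [1, 1, 1],
    [1, -1, 1],
    [-1, 1, 1],
    [-1, -1, 1],
]
labels = [1, 1, -1, -1]

for name, r in select_features(rows, labels, names):
    print(f"{name}: r = {r:.2f}")
```

In a full pipeline, only the selected columns would then be passed to the classifiers, for example scikit-learn's `GaussianNB`, `DecisionTreeClassifier`, and `RandomForestClassifier`, and accuracy would be compared with and without the filter.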

Published

2022-01-25

How to Cite

Nugraha, A. F., Tama, D. A., Istiqomah, D. A., Ramadhani, S. T. A., Kusuma, B. N., & Windarni, V. A. (2022). Feature Selection Technique for improving classification performance in the web-phishing detection process. Conference Series, 4(1), 25–31. https://doi.org/10.34306/conferenceseries.v4i1.667