The Effect of Feature Selection on Phish Website Detection
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2015, Vol 6, Issue 10
Abstract
Limited anti-phishing campaigns have recently given phishers more opportunities to evade detection with increasingly advanced deceptions. The lack of classification techniques that identify these deceptions effectively has further degraded the detection of phishing websites. Consequently, finding a small set of new, predictive, and effective features has emerged as a key challenge in keeping detection resilient. Prior works have investigated and applied selected feature selection methods to develop their own classification techniques. However, these studies have not agreed on which feature selection method best enhances classification performance. This study therefore empirically examines these methods and their effects on classification performance. It also recommends criteria for assessing their outcomes and contributes to the problem at hand. Hybrid features, low- and high-dimensional datasets, different feature selection methods, and classification models were examined. The findings show notably improved detection precision with low latency, as well as noteworthy gains in robustness and prediction susceptibilities. Although selecting an ideal feature subset is a challenging task, the findings identify the most advantageous feature subset obtainable for robust selection and effective classification in the phishing detection domain.
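As a rough illustration of what a filter-style feature selection step can look like, the sketch below ranks binary website features by information gain. The abstract does not name the methods the study compared, so information gain is used here only as a commonly cited example, and the feature names and data are made up for the illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(column):
        subset = [y for x, y in zip(column, labels) if x == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


def rank_features(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    scores = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores.append((name, information_gain(column, labels)))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: each row is one website, 1 = property present.
FEATURE_NAMES = ["has_ip_in_url", "long_url", "uses_https"]
ROWS = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
]
LABELS = [1, 1, 1, 0, 0, 0]  # 1 = phishing, 0 = legitimate

RANKING = rank_features(ROWS, LABELS, FEATURE_NAMES)
for name, gain in RANKING:
    print(f"{name}: {gain:.3f}")
```

A classifier would then be trained on only the top-ranked features, and its accuracy and latency compared against training on the full feature set.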
Authors and Affiliations
Hiba Zuhair, Ali Selmat, Mazleena Salleh