Feature Weight Optimization Mechanism for Email Spam Detection based on Two-step Clustering Algorithm and Logistic Regression Method

Abstract

This research proposes an improved spam-filtering technique for suspicious emails and messages, based on feature weighting combined with a two-step clustering algorithm and logistic regression. Distinct, important features serve as the optimal input to the proposed hybrid approach. The study adopts a spam detector model based on a distance measure and a threshold value. The aim of the model is to identify and select distinctive features for email filtering, using the feature weight method for dimensionality reduction. The two-step clustering algorithm generates a new feature called “Label”, which clusters the diverse emails and groups them by inter-sample similarity. This simplifies the spam filtering process, in which the logistic regression classifier distinguishes the hidden patterns of spam and non-spam emails. Experiments were conducted on the UCI spam dataset. The results indicate that the proposed email filtering is promising compared with other modern spam filtering methods.
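The three stages described in the abstract (feature weighting with a threshold, clustering to derive a “Label” feature, then logistic regression) can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions: the toy data and feature names are invented, the weighting scheme (absolute gap between class means) is only one possible choice, and a plain 2-means loop stands in for the two-step clustering algorithm, which the abstract does not specify in detail.

```python
import math

# Hypothetical toy data: each row is [freq_free, freq_money, freq_the].
# The values are invented for illustration, not taken from the UCI dataset.
X = [[0.9, 0.8, 0.5], [0.8, 0.9, 0.4], [0.7, 0.7, 0.5], [0.9, 0.6, 0.6],
     [0.1, 0.2, 0.5], [0.2, 0.1, 0.4], [0.0, 0.3, 0.6], [0.1, 0.0, 0.5]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = spam, 0 = non-spam


def feature_weights(X, y):
    """Weight each feature by the absolute gap between class means (assumed scheme)."""
    weights = []
    for j in range(len(X[0])):
        spam = [row[j] for row, t in zip(X, y) if t == 1]
        ham = [row[j] for row, t in zip(X, y) if t == 0]
        weights.append(abs(sum(spam) / len(spam) - sum(ham) / len(ham)))
    return weights


def select_features(weights, threshold):
    """Keep the indices of features whose weight reaches the threshold."""
    return [j for j, w in enumerate(weights) if w >= threshold]


def cluster_labels(points, iterations=10):
    """Plain 2-means (a stand-in for two-step clustering); returns a 0/1 label per point."""
    centroids = [list(points[0]), list(points[-1])]  # deterministic start
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min((0, 1), key=lambda c: math.dist(p, centroids[c])) for p in points]
        for c in (0, 1):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def build_rows(X, selected):
    """Reduce X to the selected features and append the cluster id as a 'Label' feature."""
    reduced = [[row[j] for j in selected] for row in X]
    labels = cluster_labels(reduced)
    return [r + [float(l)] for r, l in zip(reduced, labels)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, targets, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log loss; returns (weights, bias)."""
    w, b = [0.0] * len(rows[0]), 0.0
    n = len(rows)
    for _ in range(epochs):
        errors = [sigmoid(sum(wi * xi for wi, xi in zip(w, r)) + b) - t
                  for r, t in zip(rows, targets)]
        w = [wi - lr * sum(e * r[j] for e, r in zip(errors, rows)) / n
             for j, wi in enumerate(w)]
        b -= lr * sum(errors) / n
    return w, b


def predict(rows, w, b):
    """Classify a row as spam (1) when its linear score is non-negative."""
    return [1 if sum(wi * xi for wi, xi in zip(w, r)) + b >= 0 else 0 for r in rows]


weights = feature_weights(X, y)
selected = select_features(weights, threshold=0.3)  # threshold value is an assumption
rows = build_rows(X, selected)
w, b = train_logistic(rows, y)
accuracy = sum(p == t for p, t in zip(predict(rows, w, b), y)) / len(y)
print("selected features:", selected)
print("training accuracy:", accuracy)
```

On this toy data the third feature has the same mean in both classes, so the threshold drops it and only the first two features reach the clustering and classification stages. The training accuracy printed here reflects only the invented data and says nothing about the results reported in the paper.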

Authors and Affiliations

Ahmed Hamza Osman, Hani Moetque Aljahdali

  • EP ID EP262386
  • DOI 10.14569/IJACSA.2017.081054

How To Cite

Ahmed Hamza Osman, Hani Moetque Aljahdali (2017). Feature Weight Optimization Mechanism for Email Spam Detection based on Two-step Clustering Algorithm and Logistic Regression Method. International Journal of Advanced Computer Science & Applications, 8(10), 420-429. https://europub.co.uk/articles/-A-262386