Feature Weight Optimization Mechanism for Email Spam Detection based on Two-step Clustering Algorithm and Logistic Regression Method

Abstract

This research proposes an improved spam-filtering technique for suspicious emails and messages, based on feature weighting and a combination of two-step clustering and logistic regression. Unique, important features are used as the optimal input for the proposed hybrid approach. The study adopts a spam detector model based on a distance measure and a threshold value. The aim of the model is to study and select distinctive features for email filtering, using a feature weight method for dimension reduction. A two-step clustering algorithm is used to generate a new feature called “Label”, which clusters the diverse emails and groups them by inter-sample similarity. The spam filtering process is thereby simplified, with a logistic regression classifier used to distinguish the hidden patterns of spam and non-spam emails. Experiments were conducted on the UCI spam dataset. The results indicate that the proposed email filtering is promising compared with other modern spam filtering methods.
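The three stages named in the abstract (feature weighting, a cluster-derived “Label” feature, and logistic regression) can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption: the data are synthetic rather than the UCI spam dataset, the weight is a simple class-mean gap, plain k-means stands in for the two-step clustering algorithm, and all function names (`make_data`, `feature_weights`, `kmeans_labels`, `fit_logistic`, `run_pipeline`) are hypothetical.

```python
import math
import random

def make_data(n=200, n_feat=6, seed=42):
    """Synthetic stand-in for a spam dataset: only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # 1 = spam, 0 = non-spam
        X.append([rng.gauss(2.0 * label if j < 2 else 0.0, 1.0) for j in range(n_feat)])
        y.append(label)
    return X, y

def feature_weights(X, y):
    """Assumed weight: absolute gap between the spam and non-spam means of each feature."""
    weights = []
    for j in range(len(X[0])):
        spam = [row[j] for row, t in zip(X, y) if t == 1]
        ham = [row[j] for row, t in zip(X, y) if t == 0]
        weights.append(abs(sum(spam) / len(spam) - sum(ham) / len(ham)))
    return weights

def kmeans_labels(X, k, iters=20, seed=0):
    """Plain k-means, used here as a simple substitute for two-step clustering."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(X, k)]
    labels = [0] * len(X)
    for _ in range(iters):
        # Assign each sample to its nearest center (squared Euclidean distance).
        labels = [min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centers[c])))
                  for row in X]
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [row for row, l in zip(X, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.05, epochs=100):
    """Logistic regression fitted by stochastic gradient descent on the log loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for row, target in zip(X, y):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, row))) - target
            w = [wi - lr * err * xi for wi, xi in zip(w, row)]
            b -= lr * err
    return w, b

def run_pipeline(X, y, top_k=2, n_clusters=2):
    # 1. Dimension reduction: keep the top_k highest-weighted features.
    weights = feature_weights(X, y)
    keep = sorted(sorted(range(len(weights)), key=lambda j: -weights[j])[:top_k])
    reduced = [[row[j] for j in keep] for row in X]
    # 2. Cluster the reduced samples and append the cluster id as a "Label" feature.
    labels = kmeans_labels(reduced, n_clusters)
    augmented = [row + [float(l)] for row, l in zip(reduced, labels)]
    # 3. Classify with logistic regression; accuracy is measured on the training data only.
    w, b = fit_logistic(augmented, y)
    preds = [1 if b + sum(wi * xi for wi, xi in zip(w, row)) > 0 else 0 for row in augmented]
    accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
    return keep, labels, accuracy

keep, labels, accuracy = run_pipeline(*make_data())
print("kept features:", keep, "training accuracy:", round(accuracy, 3))
```

The accuracy printed here is computed on the same samples used for fitting, so it only checks that the pieces fit together; it says nothing about the results reported in the paper.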

Authors and Affiliations

Ahmed Hamza Osman, Hani Moetque Aljahdali

  • EP ID EP262386
  • DOI 10.14569/IJACSA.2017.081054

How To Cite

Ahmed Hamza Osman, Hani Moetque Aljahdali (2017). Feature Weight Optimization Mechanism for Email Spam Detection based on Two-step Clustering Algorithm and Logistic Regression Method. International Journal of Advanced Computer Science & Applications, 8(10), 420-429. https://europub.co.uk/articles/-A-262386