Voting-based Classification for E-mail Spam Detection
Journal Title: Journal of ICT Research and Applications - Year 2016, Vol 10, Issue 1
Abstract
The problem of spam e-mail has attracted a tremendous amount of attention. Although organizations commonly use spam filter applications to screen incoming e-mail, marketing companies still send unsolicited e-mails in bulk, and users still receive a considerable amount of spam despite those filters. This work proposes a new method for classifying e-mails as spam or non-spam. First, several e-mail content features are extracted; these features are then used to classify each e-mail individually. The classification results of three different classifiers (Decision Trees, Random Forests and k-Nearest Neighbor) are combined under various voting schemes (majority vote, average probability, product of probabilities, minimum probability and maximum probability) to make the final decision. To validate the method, two different spam e-mail collections were used.
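The five voting schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each classifier outputs a spam probability, that the non-spam probability is its complement, and that each probability-based scheme picks the class with the larger combined score. The function name `vote`, the 0.5 threshold and the example probabilities are hypothetical.

```python
from math import prod


def vote(spam_probs, scheme="majority", threshold=0.5):
    """Combine per-classifier spam probabilities into one decision.

    Returns True for spam, False for non-spam.
    Assumption: P(non-spam) = 1 - P(spam) for every classifier.
    """
    if scheme == "majority":
        # Each classifier casts one hard vote; spam needs more than half.
        spam_votes = sum(p > threshold for p in spam_probs)
        return spam_votes > len(spam_probs) / 2

    # Probability-based schemes: score each class, pick the larger score.
    combiners = {
        "average": lambda ps: sum(ps) / len(ps),
        "product": prod,
        "minimum": min,
        "maximum": max,
    }
    combine = combiners[scheme]  # raises KeyError for an unknown scheme
    ham_probs = [1.0 - p for p in spam_probs]
    return combine(spam_probs) > combine(ham_probs)


# Hypothetical outputs of a decision tree, a random forest and k-NN.
example = [0.9, 0.4, 0.45]
for name in ("majority", "average", "product", "minimum", "maximum"):
    print(name, vote(example, name))
```

In this example only one of the three classifiers votes for spam, so the majority vote returns non-spam, while the four probability-based schemes return spam because the single confident classifier outweighs the two weak non-spam scores. Disagreement of this kind is why the choice of scheme matters.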
Authors and Affiliations
Bashar Al-Shboul