An Adaptive Classification approach to filter spam E-mail using Vector Space Model

Journal Title: International Journal of Management, IT and Engineering - Year 2012, Vol 2, Issue 8

Abstract

Most previous data mining studies have concentrated on structured data, such as relational, transactional, and data warehouse data. In practice, however, a substantial portion of the available information is stored in text databases, which consist of large collections of documents from various sources, such as news articles, research papers, e-books, digital libraries, e-mails, and Web pages. Moreover, this volume keeps growing and is already on the order of terabytes. Among the many services the Internet provides, e-mail is one of the most useful and widely used, and spam is a problem closely tied to it. Of the various approaches developed to stop spam e-mail, filtering is an important and popular one. In this paper, we classify incoming e-mail as spam or non-spam using KNNC classification over a Vector Space Model, applied in an adaptive manner to improve accuracy. To measure accuracy, we applied KNNC classification to two datasets: a personal dataset and the Ling Spam Corpus (Lemm dataset). With the adaptive approach, we obtained nearly 95% precision on spam and 86.6% precision on non-spam, with an accuracy of 83% on the personal dataset and 80% on the Lemm dataset. Based on these results and a review of related work, we propose that an adaptive approach using the Vector Space Model with KNNC classification filters spam mail with better accuracy on both smaller and larger datasets.
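As a rough illustration of the idea in the abstract, the sketch below classifies e-mail with a k-nearest-neighbour vote over TF-IDF vectors compared by cosine similarity. It is not the authors' KNNC implementation, which the abstract does not specify. The TF-IDF weighting, the value of k, the toy messages, and the reading of "adaptive" as "newly labelled mail is added to the training set" are all assumptions made here for illustration, and the names AdaptiveKnnFilter, add, and predict are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real filter would also strip
    # punctuation, remove stop words, and possibly lemmatize.
    return text.lower().split()


def build_idf(docs):
    # Smoothed inverse document frequency for every term in the corpus.
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def vectorize(tokens, idf):
    # Sparse TF-IDF vector stored as a dict; unknown terms are dropped.
    tf = Counter(t for t in tokens if t in idf)
    return {term: count * idf[term] for term, count in tf.items()}


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class AdaptiveKnnFilter:
    """Hypothetical k-NN spam filter over a vector space model."""

    def __init__(self, k=3):
        self.k = k
        self.docs = []
        self.labels = []

    def add(self, text, label):
        # Assumed "adaptive" step: each newly labelled message joins
        # the training set and influences later predictions.
        self.docs.append(tokenize(text))
        self.labels.append(label)

    def predict(self, text):
        # Recompute IDF on every call so added messages take effect.
        idf = build_idf(self.docs)
        query = vectorize(tokenize(text), idf)
        scored = sorted(
            ((cosine(query, vectorize(doc, idf)), label)
             for doc, label in zip(self.docs, self.labels)),
            key=lambda pair: pair[0],
            reverse=True,
        )
        votes = Counter(label for _, label in scored[:self.k])
        return votes.most_common(1)[0][0]


# Toy training data, invented for this example only.
filt = AdaptiveKnnFilter(k=3)
filt.add("win free money now", "spam")
filt.add("free prize claim now", "spam")
filt.add("cheap money offer free", "spam")
filt.add("meeting agenda for monday", "ham")
filt.add("project report attached for review", "ham")
filt.add("lunch on monday with the team", "ham")

print(filt.predict("claim your free money"))
print(filt.predict("monday project meeting"))
```

Recomputing the IDF weights on every prediction keeps the sketch short but would be slow on a corpus the size of Ling Spam; a practical filter would update the weights incrementally or in batches.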

Authors and Affiliations

Nakul Dave, Uttam Chauhan and Avani Dave

How To Cite

Nakul Dave, Uttam Chauhan and Avani Dave (2012). An Adaptive Classification approach to filter spam E-mail using Vector Space Model. International Journal of Management, IT and Engineering, 2(8), -. https://europub.co.uk/articles/-A-18502