An Adaptive Classification approach to filter spam E-mail using Vector Space Model

Journal Title: International Journal of Management, IT and Engineering - Year 2012, Vol 2, Issue 8

Abstract

The majority of previous data mining studies have concentrated on structured data, such as relational, transactional, and data warehouse data. In practice, however, a significant portion of the available information is stored in text databases, which consist of large collections of documents from various sources, such as news articles, research papers, e-books, digital libraries, e-mails, and Web pages. Moreover, this body of text keeps growing and already amounts to terabytes. Among the many services the internet provides, e-mail is one of the most useful and widely used, and spam is a problem closely tied to it. Among the various approaches developed to stop spam e-mail, filtering is an important and popular one. In this paper, we categorize incoming e-mail as spam or non-spam using the KNNC classification method with a Vector Space Model, applied in an adaptive manner to improve accuracy. To measure the accuracy of spam classification, we applied KNNC classification to two datasets: a personal dataset and the Ling Spam Corpus (Lemm dataset). Using the adaptive approach, we obtained nearly 95% precision for spam and 86.6% precision for non-spam, with 83% accuracy on the personal dataset and 80% on the Lemm dataset. Based on these results and a review of related work, we propose that an adaptive approach using the Vector Space Model with the KNNC classification method can efficiently provide better accuracy for filtering spam mail on both smaller and larger datasets.
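The abstract does not spell out the method, so the following is only a minimal sketch of the general idea, assuming that KNNC means k-nearest-neighbor classification, that messages are represented as raw term-frequency vectors, and that similarity is measured by cosine. The names `vectorize`, `cosine`, and `KnnSpamFilter` are hypothetical, and the `learn` method reflects just one possible reading of "adaptive": newly labeled mail is added to the stored examples so that later decisions take it into account.

```python
import math
from collections import Counter


def vectorize(text):
    """Map a message to a sparse term-frequency vector (term -> count)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class KnnSpamFilter:
    """Hypothetical k-nearest-neighbor filter over a vector space model."""

    def __init__(self, k=3):
        self.k = k
        self.examples = []  # list of (vector, label) pairs

    def learn(self, text, label):
        # Assumed "adaptive" step: every newly labeled message
        # becomes a stored example for future classifications.
        self.examples.append((vectorize(text), label))

    def classify(self, text):
        if not self.examples:
            raise ValueError("no labeled examples to compare against")
        query = vectorize(text)
        # Rank stored examples by similarity to the query, most similar first.
        ranked = sorted(
            self.examples,
            key=lambda example: cosine(query, example[0]),
            reverse=True,
        )
        # Majority vote among the k nearest neighbors.
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]
```

In use, one would call `learn` for each labeled message in a training set, call `classify` on incoming mail, and feed corrected labels back through `learn`. The paper's actual term weighting, choice of k, and adaptation rule may differ from this sketch.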

Authors and Affiliations

Nakul Dave, Uttam Chauhan and Avani Dave

How To Cite

Nakul Dave, Uttam Chauhan and Avani Dave (2012). An Adaptive Classification approach to filter spam E-mail using Vector Space Model. International Journal of Management, IT and Engineering, 2(8), -. https://europub.co.uk/articles/-A-18502