The Informative Vector Selection in Active Learning using Divisive Analysis
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 10
Abstract
Traditional supervised machine learning techniques require training on large volumes of data to achieve efficiency and accuracy. In contrast, active learning systems reduce the size of the training data significantly, because the data are selected on the basis of a strong mathematical model. This makes it possible to reach the same accuracy as baseline techniques with a considerably smaller training dataset. In this paper, the active learning approach is implemented with a modification to the traditional active learning system that uses the version space algorithm. The version space concept is replaced with the divisive analysis (DIANA) algorithm; the core idea is to pre-cluster the instances before distributing them into training and testing data. The results obtained by our system support our reasoning that pre-clustering, used in place of the traditional version space algorithm, can improve the accuracy of the overall system's classification. Two types of data were tested: binary-class and multi-class. The proposed system worked well on the multi-class data, but on the binary data the version space algorithm produced more accurate results.
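To make the pre-clustering idea concrete, the following is a minimal Python sketch of DIANA-style divisive clustering followed by the selection of one instance per cluster. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which distance measure is used or how instances are chosen from each cluster, so Euclidean distance and a medoid-based choice are assumed here, and the names `diana`, `split`, and `pick_seeds` are hypothetical.

```python
import math

def avg_dist(i, group, points):
    """Average Euclidean distance from point i to the other members of group."""
    others = [j for j in group if j != i]
    if not others:
        return 0.0
    return sum(math.dist(points[i], points[j]) for j in others) / len(others)

def diameter(cluster, points):
    """Largest pairwise distance inside a cluster."""
    return max(math.dist(points[i], points[j]) for i in cluster for j in cluster)

def split(cluster, points):
    """One DIANA split: grow a splinter group out of the given cluster."""
    # Seed the splinter group with the most dissimilar member.
    seed = max(cluster, key=lambda i: avg_dist(i, cluster, points))
    splinter = [seed]
    rest = [i for i in cluster if i != seed]
    while len(rest) > 1:
        # Gain > 0 means the point is closer to the splinter group than to the rest.
        gains = {i: avg_dist(i, rest, points) - avg_dist(i, splinter, points)
                 for i in rest}
        best = max(rest, key=gains.get)
        if gains[best] <= 0:
            break
        rest.remove(best)
        splinter.append(best)
    return splinter, rest

def diana(points, k):
    """Divide the data top-down until k clusters exist (or no cluster can be split)."""
    clusters = [list(range(len(points)))]
    while len(clusters) < k:
        target = max(clusters, key=lambda c: diameter(c, points))
        if len(target) < 2:
            break
        clusters.remove(target)
        clusters.extend(split(target, points))
    return clusters

def pick_seeds(points, k):
    """Assumed selection rule: take the medoid of each cluster as a training instance."""
    return [min(c, key=lambda i: avg_dist(i, c, points)) for c in diana(points, k)]

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = diana(points, 2)
seeds = pick_seeds(points, 2)
```

On the six toy points above, the two well-separated groups end up in separate clusters and one index is picked from each. In an active learning loop, such picked instances would be the ones sent for labeling, with the remaining instances left for testing or later queries.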
Authors and Affiliations
Zareen Sharf, Maryam Razzak
Transforming Service Delivery with TOGAF and Archimate in a Government Agency in Peru
The application of The Open Group Architecture Framework (TOGAF) and Archimate to transform the citizen's service delivery by the Ministry of Labor and Employment Promotion of Peru is presented. The enterprise architectu...
Cadastral and Tea Production Management System with Wireless Sensor Network, GIS based System and IoT Technology
Cadastral and tea production management system utilizing wireless sensor network of Internet of Things (IoT) technology is proposed. To improve efficiency of tea productions, cadastral management and tea production proce...
A Routing Calculus with Distance Vector Routing Updates
We propose a routing calculus in a process algebraic framework to implement dynamic updates of routing table using distance vector routing. This calculus is an extension of an existing routing calculus DRωπ where routing...
Data Mining in Education
Data mining techniques are used to extract useful knowledge from raw data. The extracted knowledge is valuable and significantly affects the decision maker. Educational data mining (EDM) is a method for extracting useful...
Effective Listings of Function Stop words for Twitter
Many words in documents recur very frequently but are essentially meaningless as they are used to join words together in a sentence. It is commonly understood that stop words do not contribute to the context or content o...