EVALUATING THE EFFECT OF DATASET SIZE ON PREDICTIVE MODEL USING SUPERVISED LEARNING TECHNIQUE

Abstract

Learning models used for prediction are mostly developed without much attention to the size of dataset needed to produce models of high accuracy and good generalization. The general belief is that a large dataset is needed to construct a predictive learning model. Whether a dataset counts as large, however, depends on the circumstances, so what makes a dataset big or small is vague. In this paper, the ability of a predictive model to generalize with respect to a particular size of data, when simulated with new input not seen during training, is examined. The study experiments on three different sizes of data, using a MATLAB program to create predictive models, with a view to establishing whether the size of the data has any effect on the accuracy of a model. The simulated output of each model is measured using the Mean Absolute Error (MAE), and comparisons are made. Findings from this study reveal that the quantity of data partitioned for training must be a good representation of the entire set and sufficient to span the input space. The results of simulating the three network models also show that the learning model with the largest training set appears to be the most accurate and consistently delivers better and more stable results.
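
As an illustration of the error measure named in the abstract, the following is a minimal sketch of how the Mean Absolute Error could be computed and compared across models. It is written in Python rather than the MATLAB program used in the study, and the function name, the model labels and all numeric values are hypothetical placeholders, not data or results from the paper.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Return the mean of |actual_i - predicted_i| over all samples."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one sample is required")
    total = sum(abs(a - p) for a, p in zip(actual, predicted))
    return total / len(actual)


if __name__ == "__main__":
    # Hypothetical target values for unseen test inputs (not from the paper).
    targets = [1.0, 2.0, 3.0, 4.0]

    # Hypothetical simulated outputs of three models trained on
    # small, medium and large training sets (illustrative values only).
    simulated = {
        "small": [1.6, 1.2, 3.9, 3.1],
        "medium": [1.3, 1.7, 3.4, 3.6],
        "large": [1.1, 2.1, 2.9, 4.1],
    }

    # A lower MAE means the simulated output is closer to the targets.
    for name, outputs in simulated.items():
        print(name, mean_absolute_error(targets, outputs))
```

Comparing models on the same held-out inputs, as in this sketch, keeps the MAE values directly comparable, since each is averaged over the same set of targets.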

Authors and Affiliations

A. R. Ajiboye, Abdullah Arshah, H. Qin

How To Cite

A. R. Ajiboye, Abdullah Arshah, H. Qin (2015). EVALUATING THE EFFECT OF DATASET SIZE ON PREDICTIVE MODEL USING SUPERVISED LEARNING TECHNIQUE. International Journal of Software Engineering and Computer Systems, 1(1), 75-84. https://europub.co.uk/articles/-A-254080