Using Unlabeled Data to Improve Inductive Models by Incorporating Transductive Models

Abstract

This paper shows how to use labeled and unlabeled data to improve inductive models with the help of transductive models. We propose a solution for the self-training scenario. Self-training is an effective semi-supervised wrapper method that can generalize any type of supervised inductive model to the semi-supervised setting. It iteratively refines an inductive model by bootstrapping from unlabeled data. Standard self-training uses the classifier (trained on labeled examples) to label and select candidates from the unlabeled training set. This can be problematic: because labeled training data is usually scarce, the initial classifier may not be able to provide highly confident predictions. As a result, it risks introducing too many wrongly labeled candidates into the labeled training set, which may severely degrade performance. To tackle this problem, we propose a novel self-training-style algorithm that incorporates a graph-based transductive model into the self-labeling process. Unlike standard self-training, our algorithm uses labeled and unlabeled data as a whole to label and select unlabeled examples for training set augmentation. A robust transductive model based on a graph Markov random walk is proposed, which exploits the manifold assumption to output reliable predictions on unlabeled data using noisy labeled examples. The proposed algorithm can greatly reduce the risk of performance degradation due to accumulated noise in the training set. Experiments show that the proposed algorithm can effectively utilize unlabeled data to improve classification performance.
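The self-labeling loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the sketch substitutes standard clamped label propagation on a Gaussian-affinity graph for the paper's random-walk model, and a nearest-centroid classifier for the inductive model. All function names and parameters (`sigma`, `steps`, `rounds`, `per_round`) are assumptions made for illustration, and every class is assumed to have at least one labeled example.

```python
import math

def transition_matrix(points, sigma=1.0):
    """Row-normalised Gaussian affinities, read as one-step random-walk probabilities."""
    n = len(points)
    P = []
    for i in range(n):
        row = [0.0 if i == j else
               math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
               for j in range(n)]
        total = sum(row)
        P.append([w / total for w in row])
    return P

def propagate(points, labels, n_classes, sigma=1.0, steps=50):
    """Transductive step (assumed stand-in): spread labels over the graph.

    labels maps point index -> class. Labeled points are clamped after each step.
    Returns one score list per point.
    """
    P = transition_matrix(points, sigma)
    n = len(points)
    F = [[0.0] * n_classes for _ in range(n)]
    for i, c in labels.items():
        F[i][c] = 1.0
    for _ in range(steps):
        F = [[sum(P[i][j] * F[j][c] for j in range(n)) for c in range(n_classes)]
             for i in range(n)]
        for i, c in labels.items():
            F[i] = [1.0 if k == c else 0.0 for k in range(n_classes)]
    return F

def fit_centroids(points, labels, n_classes):
    """Inductive model (assumed stand-in): one mean vector per class."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for i, c in labels.items():
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += points[i][d]
    return [[s / counts[c] for s in sums[c]] for c in range(n_classes)]

def predict(centroids, x):
    """Classify an unseen point by its nearest class centroid."""
    return min(range(len(centroids)), key=lambda c: math.dist(centroids[c], x))

def self_train(points, labels, n_classes, rounds=3, per_round=2, sigma=1.0):
    """Each round, the graph model labels the most confident unlabeled points."""
    labels = dict(labels)
    for _ in range(rounds):
        unlabeled = [i for i in range(len(points)) if i not in labels]
        if not unlabeled:
            break
        F = propagate(points, labels, n_classes, sigma)

        def confidence(i):
            total = sum(F[i])
            return max(F[i]) / total if total > 0 else 0.0

        for i in sorted(unlabeled, key=confidence, reverse=True)[:per_round]:
            labels[i] = max(range(n_classes), key=lambda c: F[i][c])
    return fit_centroids(points, labels, n_classes), labels
```

The point of the design is that candidate selection uses the whole point set (labeled and unlabeled) through the graph, while the final model remains inductive and can classify points that were never part of the graph via `predict`.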

Authors and Affiliations

ShengJun Cheng, Jiafeng Liu, XiangLong Tang

  • EP ID EP131476
  • DOI 10.14569/IJARAI.2014.030207

How To Cite

ShengJun Cheng, Jiafeng Liu, XiangLong Tang (2014). Using Unlabeled Data to Improve Inductive Models by Incorporating Transductive Models. International Journal of Advanced Research in Artificial Intelligence (IJARAI), 3(2), 32-38. https://europub.co.uk/articles/-A-131476