Using PCA and Factor Analysis for Dimensionality Reduction of Bio-informatics Data
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 5
Abstract
A large volume of genomics data is produced daily owing to advances in sequencing technology. This data has no value unless it is properly analysed. Different kinds of analytics are required to extract useful information from the raw data. Classification, prediction, clustering and pattern extraction are useful data mining techniques, but they require an appropriate selection of data attributes to give accurate results. However, bioinformatics data is high dimensional, usually having hundreds of attributes. Such a large number of attributes affects the performance of the machine learning algorithms used for classification and prediction. Dimensionality reduction techniques are therefore required to reduce the number of attributes passed on to further analysis. In this paper, Principal Component Analysis and Factor Analysis are used for dimensionality reduction of bioinformatics data. These techniques were applied to the Leukaemia data set and the number of attributes was reduced from to.
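As an illustration of the kind of reduction the abstract describes, the following is a minimal sketch of Principal Component Analysis computed with a singular value decomposition in NumPy. It is not the authors' implementation: the function name `pca_reduce`, the synthetic data, and the sizes (60 samples, 500 attributes, 5 components) are assumptions chosen for the example, and the Leukaemia data set itself is not used. Factor Analysis is not shown.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top n_components principal components.

    Hypothetical helper for illustration only.
    """
    # Centre each attribute (column) at zero mean
    X_centered = X - X.mean(axis=0)
    # SVD of the centred data: the rows of Vt are the principal axes
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance captured by each axis, and the share kept by the top ones
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, ratio

# Synthetic stand-in for high-dimensional data (assumed sizes, not real data):
# 60 samples, 500 attributes generated from 5 hidden factors plus small noise
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 5))
loadings = rng.normal(size=(5, 500))
X = latent @ loadings + 0.1 * rng.normal(size=(60, 500))

X_reduced, ratio = pca_reduce(X, n_components=5)
print(X_reduced.shape)  # (60, 5)
```

Because the synthetic attributes are driven by five hidden factors, five components retain almost all of the variance here. On real data the number of components to keep would have to be chosen from the explained-variance ratios.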
Authors and Affiliations
M. Usman Ali, Shahzad Ahmed, Javed Ferzund, Atif Mehmood, Abbas Rehman