Using PCA and Factor Analysis for Dimensionality Reduction of Bio-informatics Data

Abstract

Advances in sequencing technology produce a large volume of genomics data every day. This data has no value unless it is properly analysed, and different kinds of analytics are required to extract useful information from it. Classification, prediction, clustering and pattern extraction are useful data mining techniques, but they require an appropriate selection of data attributes to give accurate results. Bioinformatics data, however, is high dimensional, usually having hundreds of attributes. Such a large number of attributes degrades the performance of the machine learning algorithms used for classification and prediction, so dimensionality reduction techniques are needed to reduce the number of attributes before further analysis. In this paper, Principal Component Analysis and Factor Analysis are used for dimensionality reduction of Bioinformatics data. These techniques were applied to the Leukaemia data set and the number of attributes was reduced from to.
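To make the idea of attribute reduction concrete, the following is a minimal sketch of the PCA half of the approach, written with NumPy only. It is not the authors' implementation: the data is a synthetic stand-in (60 samples, 200 attributes generated from 5 hidden factors) rather than the Leukaemia data set, and the function name `pca_reduce` and all sizes are assumptions chosen for illustration. Factor Analysis is not covered here.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components.

    Hypothetical helper for illustration; returns the reduced data and
    the fraction of total variance captured by each kept component.
    """
    # Centre each attribute (column) so the components describe variance.
    X_centered = X - X.mean(axis=0)
    # SVD of the centred data: the rows of Vt are the principal axes,
    # ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance along each axis is the squared singular value over (n - 1).
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, ratio


# Synthetic stand-in data (an assumption, not the paper's data set):
# 200 observed attributes driven by 5 hidden factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 5))
loadings = rng.normal(size=(5, 200))
X = latent @ loadings + 0.1 * rng.normal(size=(60, 200))

X_reduced, ratio = pca_reduce(X, n_components=5)
print(X_reduced.shape)   # (60, 5): 200 attributes reduced to 5
print(ratio.sum())       # share of total variance kept by the 5 components
```

In practice the number of components to keep would be chosen from the explained-variance ratios on the real data rather than fixed in advance as it is here.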

Authors and Affiliations

M. Usman Ali, Shahzad Ahmed, Javed Ferzund, Atif Mehmood, Abbas Rehman

  • EP ID EP259091
  • DOI 10.14569/IJACSA.2017.080551

How To Cite

M. Usman Ali, Shahzad Ahmed, Javed Ferzund, Atif Mehmood, Abbas Rehman (2017). Using PCA and Factor Analysis for Dimensionality Reduction of Bio-informatics Data. International Journal of Advanced Computer Science & Applications, 8(5), 415-426. https://europub.co.uk/articles/-A-259091