Reducing Dimensionality in Text Mining using Conjugate Gradients and Hybrid Cholesky Decomposition

Abstract

Data mining on large datasets faces several limitations when identifying the records relevant to a given query. These include a lack of interaction in the required objective space; an inability to handle datasets with discrete variables, especially when values are missing; an inability to classify records according to the query; and poor generation of explicit knowledge for a query, which increases the dimensionality of the data. This paper therefore addresses the growth of data dimensionality using a modified non-negative matrix factorization (NMF). The additional dimensionality that arises from the non-orthogonality of NMF is resolved with Cholesky decomposition (cdNMF). First, the datasets are structured into a well-defined geometric form. Next, the complex conjugate values are extracted and the conjugate gradient algorithm is applied to reduce the sparse matrix derived from the data vector. cdNMF is then used to extract the feature vector from the dataset, and the data vector is linearly mapped using the upper triangular matrix obtained from the Cholesky decomposition. The method is evaluated on accuracy and normalized mutual information (NMI) over three text databases with varied patterns. The results show that, on larger instances, the proposed technique retrieves the documents matching a query better than NMF, neighborhood preserving non-negative matrix factorization (NPNMF), multiple manifolds non-negative matrix factorization (MMNMF), robust non-negative matrix factorization (RNMF), graph regularized non-negative matrix factorization (GNMF), hierarchical non-negative matrix factorization (HNMF) and cdNMF.
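
The abstract names three numerical building blocks (NMF, Cholesky decomposition and conjugate gradients) without giving update rules, so the following is only a minimal sketch of how such pieces could fit together, not the author's cdNMF. It assumes standard Lee-Seung multiplicative updates for NMF, orthogonalizes the NMF basis through the upper triangular Cholesky factor of its Gram matrix, and uses conjugate gradients to project a query onto the basis. The toy term-document matrix V, the query q and all function names are hypothetical.

```python
import numpy as np

def nmf(V, rank, iters=300, seed=0, eps=1e-9):
    """Approximate V by W @ H with W, H >= 0 (Lee-Seung multiplicative updates)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cholesky_orthogonalize(W, ridge=1e-10):
    """Factor W as Q @ R, with orthonormal columns in Q and upper triangular R.

    R comes from the Cholesky factorization of the Gram matrix W.T @ W;
    the small ridge term (an assumption) guards against rank deficiency.
    """
    G = W.T @ W + ridge * np.eye(W.shape[1])
    L = np.linalg.cholesky(G)          # G = L @ L.T, L lower triangular
    R = L.T                            # upper triangular factor
    Q = np.linalg.solve(L, W.T).T      # Q = W @ inv(R)
    return Q, R

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A @ x = b for a symmetric positive definite matrix A."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Hypothetical toy term-document matrix: 6 terms (rows) x 4 documents (columns).
V = np.array([[3, 2, 0, 0],
              [2, 3, 0, 0],
              [1, 1, 0, 1],
              [0, 0, 3, 2],
              [0, 0, 2, 3],
              [0, 1, 1, 1]], dtype=float)

W, H = nmf(V, rank=2)                  # non-negative, generally non-orthogonal basis
Q, R = cholesky_orthogonalize(W)       # orthonormal basis spanning the same subspace
features = Q.T @ V                     # documents linearly mapped to 2 dimensions

# Project a hypothetical query onto the NMF basis via the normal equations.
q = np.array([1, 1, 0, 0, 0, 0], dtype=float)
h = conjugate_gradient(W.T @ W, W.T @ q)
```

The Cholesky step here is the textbook "Cholesky QR" construction: it removes the correlation between NMF basis vectors while keeping the subspace they span, which is one plausible reading of how an upper triangular factor can be used to map data vectors linearly.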

Authors and Affiliations

Jasem M. Alostad

  • EP ID EP260200
  • DOI 10.14569/IJACSA.2017.080716

How To Cite

Jasem M. Alostad (2017). Reducing Dimensionality in Text Mining using Conjugate Gradients and Hybrid Cholesky Decomposition. International Journal of Advanced Computer Science & Applications, 8(7), 110-116. https://europub.co.uk/articles/-A-260200