Reducing Dimensionality in Text Mining using Conjugate Gradients and Hybrid Cholesky Decomposition

Abstract

Data mining in large datasets faces several limitations in identifying the data relevant to a given query. These include a lack of interaction in the required objective space; an inability to handle datasets with discrete variables, especially when values are missing; an inability to classify records according to the given query; and poor generation of explicit knowledge for a query, which increases the dimensionality of the data. This paper therefore aims to resolve the problems of increasing data dimensionality using a modified non-negative matrix factorization (NMF). The increased dimensionality arising from the non-orthogonality of NMF is resolved with Cholesky decomposition (cdNMF). First, the datasets are structured to form a well-defined geometric structure. Next, the complex conjugate values are extracted and the conjugate gradient algorithm is applied to reduce the sparse matrix from the data vector. cdNMF is then used to extract the feature vector from the dataset, and the data vector is linearly mapped from the upper triangular matrix obtained from the Cholesky decomposition. The method is evaluated using accuracy and normalized mutual information (NMI) on three text databases of varied patterns. The results show that, in finding the documents that match a query, the proposed technique fits larger instances better than NMF, neighborhood preserving non-negative matrix factorization (NPNMF), multiple manifolds non-negative matrix factorization (MMNMF), robust non-negative matrix factorization (RNMF), graph regularized non-negative matrix factorization (GNMF), hierarchical non-negative matrix factorization (HNMF) and cdNMF.
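
The abstract does not give the exact formulation of cdNMF, so the following Python sketch is only one plausible reading of the idea, not the author's method. It assumes a term-by-document matrix `V`, factors it with the standard multiplicative-update NMF, and then uses the upper triangular Cholesky factor `R` of the Gram matrix `W^T W` to re-express the non-orthogonal basis `W` as an orthonormal one while mapping the document coefficients linearly through `R`. The function names `nmf` and `cholesky_map`, the `jitter` term, and the random test matrix are illustrative assumptions; the conjugate gradient step is omitted.

```python
import numpy as np


def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Standard NMF via multiplicative updates: V is approximated by W @ H."""
    rng = np.random.default_rng(seed)
    n_terms, n_docs = V.shape
    W = rng.random((n_terms, rank)) + eps
    H = rng.random((rank, n_docs)) + eps
    for _ in range(iters):
        # eps keeps the denominators strictly positive.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def cholesky_map(W, H, jitter=1e-8):
    """Assumed mapping step: orthogonalize W through the Cholesky factor of W^T W."""
    gram = W.T @ W + jitter * np.eye(W.shape[1])  # jitter guards against a singular Gram matrix
    R = np.linalg.cholesky(gram).T                # upper triangular, gram = R^T R
    W_orth = np.linalg.solve(R.T, W.T).T          # W @ inv(R): columns are orthonormal
    H_mapped = R @ H                              # coefficients in the orthonormal basis
    return W_orth, H_mapped, R


# Hypothetical data: 12 terms, 8 documents, 3 latent features.
V = np.random.default_rng(1).random((12, 8))
W, H = nmf(V, rank=3)
W_orth, H_mapped, R = cholesky_map(W, H)
```

Because `W_orth @ H_mapped` equals `W @ H`, this mapping leaves the approximation of `V` unchanged; it only changes the basis in which each document vector (a column of `H_mapped`) is expressed. Note that `W_orth` is generally no longer non-negative.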

Authors and Affiliations

Jasem M. Alostad

  • EP ID EP260200
  • DOI 10.14569/IJACSA.2017.080716

How To Cite

Jasem M. Alostad (2017). Reducing Dimensionality in Text Mining using Conjugate Gradients and Hybrid Cholesky Decomposition. International Journal of Advanced Computer Science & Applications, 8(7), 110-116. https://europub.co.uk/articles/-A-260200