USING PENALIZED REGRESSION WITH PARALLEL COORDINATES FOR VISUALIZATION OF SIGNIFICANCE IN HIGH DIMENSIONAL DATA
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2013, Vol 4, Issue 10
Abstract
In recent years, there has been an exponential increase in the amount of data produced and disseminated by diverse applications, intensifying the need for effective methods for the interactive visual and analytical exploration of large, high-dimensional datasets. In this paper, we describe a novel tool for multivariate data visualization and exploration that integrates regression analysis with advanced parallel-coordinates visualization. Parallel coordinates is a classical method for presenting raw multivariate data on a 2D screen. However, current tools suffer from a variety of problems when applied to massively high-dimensional datasets. Our system tackles these issues by combining regression analysis with a variety of enhancements to traditional parallel-coordinates displays, including new techniques for handling visual clutter and intuitive solutions for selecting, ordering, and grouping dimensions. We demonstrate the effectiveness of our system through two case studies.
Authors and Affiliations
Shengwen Wang, Yi Yang, Jih-Sheng Chang, Fang-Pang Lin