Storage Consumption Reduction using Improved Inverted Indexing for Similarity Search on LINGO Profiles
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2019, Vol 10, Issue 5
Abstract
Large chemical datasets contain millions of compounds, which are represented using the Simplified Molecular-Input Line-Entry System (SMILES). Fragmenting SMILES strings into overlapping substrings of a defined size, called LINGO Profiles, avoids the otherwise time-consuming conversion process. One drawback of this process is that it generates numerous identical LINGO Profiles. The inverted indexing approach, introduced by Kristensen et al., is a modification intended to deal with the large number of molecules residing in the database. Implementing this technique reduced the storage space requirement of the dataset by half, while also achieving a significant speedup and a favourable accuracy value when performing similarity searching. This report presents an in-depth analysis of the results, with conclusions about the effectiveness of the working prototype built for this study.
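The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' prototype: the function names, the substring length q = 4, the use of a multiset Tanimoto coefficient as the similarity measure, and the toy SMILES strings are all assumptions made for illustration, and any SMILES preprocessing is omitted. Each distinct LINGO is stored once as an index key that points to the molecules containing it, so identical LINGOs are not repeated per molecule.

```python
from collections import Counter, defaultdict


def lingo_profile(smiles, q=4):
    """Multiset of overlapping length-q substrings of a SMILES string (q=4 is an assumption)."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def build_inverted_index(profiles):
    """Map each distinct LINGO to a list of (molecule id, occurrence count) postings."""
    index = defaultdict(list)
    for mol_id, profile in enumerate(profiles):
        for lingo, count in profile.items():
            index[lingo].append((mol_id, count))
    return index


def tanimoto_search(query, index, sizes):
    """Score only molecules that share at least one LINGO with the query.

    Uses the multiset Tanimoto coefficient (an assumed similarity measure):
    shared / (|query| + |molecule| - shared).
    """
    shared = defaultdict(int)
    for lingo, q_count in query.items():
        for mol_id, count in index.get(lingo, ()):
            shared[mol_id] += min(q_count, count)
    q_size = sum(query.values())
    return {m: s / (q_size + sizes[m] - s) for m, s in shared.items()}


# Hypothetical toy database: butanol, butylamine, benzene.
database = ["CCCCO", "CCCCN", "c1ccccc1"]
profiles = [lingo_profile(s) for s in database]
sizes = [sum(p.values()) for p in profiles]  # total LINGO count per molecule
index = build_inverted_index(profiles)

scores = tanimoto_search(lingo_profile("CCCCO"), index, sizes)
for mol_id, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(database[mol_id], round(score, 3))
```

In this sketch, molecules that share no LINGO with the query are never visited, which is where an inverted index saves search time; the storage saving comes from keeping each distinct LINGO string only once.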
Authors and Affiliations
Muhammad Jaziem bin Mohamed Javeed, Nurul Hashimah Ahamed Hassain Malim
Crytosystem for Computer security using Iris patterns and Hetro correlators
Biometric-based cryptography systems provide efficient and secure data transmission compared to traditional encryption systems. However, it is a computationally challenging task to solve the issues to incorporate b...
Implementation of Intelligent Automated Gate System with QR Code
This paper is about a QR code-based automated gate system. The aim of the research is to develop and implement a type of medium-level security gate system especially for small companies that cannot afford to install high-t...
Comparison of Localization Free Routing Protocols in Underwater Wireless Sensor Networks
The Underwater Wireless Sensor Network (UWSN) is a newly developed branch of the Wireless Sensor Network (WSN). UWSN is used for exploration of underwater resources, oceanographic data collection, flood or disaster prevention, tac...
Cloud Computing: Empirical Studies in Higher Education A Literature Review
The advent of cloud computing (CC) in recent years has attracted substantial interest from various institutions, especially higher education institutions, which wish to consider the advantages of its features. Many unive...
Analyzing Opinions and Argumentation in News Editorials and Op-Eds
Analyzing opinions and arguments in news editorials and op-eds is an interesting and challenging task. The challenges lie at multiple levels: the text has to be analyzed at the discourse level (paragraphs and ab...