A Semantic Approach for Outlier Detection in Big Data Streams
Journal Title: Webology - Year 2019, Vol 16, Issue 1
Abstract
In recent years, data generation and collection technologies have undergone a revolution. The volume, velocity and veracity of data have changed drastically, creating new challenges in data analysis, modeling and prediction. One key challenge is the semantic analysis of textual data, especially in big data stream settings. Existing solutions focus on either topic analysis or sentiment analysis. Moreover, semantic outlier detection over data streams, one of the key problems in data mining and data analysis, has received less attention. In this paper, we introduce a new concept of semantic outlier in which the topic of the textual data is treated as the primary content of the data stream, while the sentiment is treated as the context in which the data was generated and which affected it. We also propose a framework for semantic outlier detection in big data streams that incorporates contextual detection concepts. The advantage of the proposed concept is that it combines topic and sentiment analysis into a single process, while the framework allows different algorithms and approaches for semantic analysis to be implemented.
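To make the idea of a contextual semantic outlier more concrete, the following is a minimal sketch and not the authors' framework. It assumes that each stream record has already been labeled with a topic and a sentiment by some upstream analyzers, groups recent records by sentiment (the context), and flags a record whose topic (the content) is rare within that context. The class name, the sliding-window size, and the `min_share` and `min_history` parameters are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict, deque


class SemanticOutlierSketch:
    """Illustrative contextual detector: sentiment is the context, topic is the content.

    All names and thresholds here are assumptions for this sketch,
    not taken from the paper.
    """

    def __init__(self, window=100, min_share=0.05, min_history=10):
        self.min_share = min_share      # topics rarer than this share are flagged
        self.min_history = min_history  # do not judge a context with too few records
        # One bounded sliding window of recent topics per sentiment context.
        self.recent = defaultdict(lambda: deque(maxlen=window))

    def observe(self, topic, sentiment):
        """Return True if `topic` is rare among recent records with this sentiment."""
        past = self.recent[sentiment]
        flagged = False
        if len(past) >= self.min_history:
            share = Counter(past)[topic] / len(past)
            flagged = share < self.min_share
        past.append(topic)  # the record joins the window after being judged
        return flagged


if __name__ == "__main__":
    detector = SemanticOutlierSketch()
    for _ in range(20):
        detector.observe("sports", "positive")
    # A topic not seen before in the "positive" context is flagged.
    print(detector.observe("finance", "positive"))
    # A context with no history yet is never flagged.
    print(detector.observe("finance", "negative"))
```

Keeping one bounded window per context keeps memory use fixed as the stream grows. Any topic model or sentiment classifier could supply the labels, in the spirit of the framework's stated openness to different semantic analysis algorithms.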
Authors and Affiliations
Hussien Ahmad and Salah Dowaji