A Novel Semantically-Time-Referrer based Approach of Web Usage Mining for Improved Sessionization in Pre-Processing of Web Log
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 1
Abstract
Web usage mining (WUM), also known as web log mining, is the application of data mining techniques to large volumes of data in order to extract useful and interesting user behaviour patterns from web logs and thereby improve web-based applications. This paper aims to improve data discovery by mining usage data from log files. The work is done in three phases. The first and second phases, data cleaning and user identification respectively, are completed using traditional methods. The third phase, session identification, is done using three different methods. The main focus of this paper is the sessionization of the log file, which is a critical step for extracting usage patterns. The proposed referrer-time and semantically-time-referrer methods overcome the limitations of traditional methods. The main advantage of the pre-processing model presented in this paper over other methods is that it can process a text or Excel log file of any format. Experiments performed on three different log files indicate that the proposed semantically-time-referrer based heuristic approach achieves better results than the traditional time-based and referrer-time based methods. The proposed methods are not complex to use. The web log files are collected from different servers and contain the public information of visitors. In addition, this paper discusses different types of web log formats.
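To make the session identification phase concrete, the following is a minimal Python sketch of a time-based heuristic and a simple referrer-time variant. It is an illustration under stated assumptions, not the authors' implementation: the 30-minute timeout, the record layout, the sample data, and the function name `sessionize` are all hypothetical, and the abstract does not specify the exact referrer rule. The semantic component of the proposed method is not sketched because the abstract gives no details of it.

```python
from datetime import datetime, timedelta

# Hypothetical, already-cleaned records: (user, timestamp, url, referrer).
# "user" stands in for the output of the user identification phase.
LOG = [
    ("u1", "2017-01-01 10:00:00", "/home", "-"),
    ("u1", "2017-01-01 10:05:00", "/products", "/home"),
    ("u1", "2017-01-01 10:50:00", "/contact", "-"),
    ("u2", "2017-01-01 10:01:00", "/home", "-"),
    ("u1", "2017-01-01 10:55:00", "/about", "/contact"),
    ("u1", "2017-01-01 11:00:00", "/blog", "http://example.org/"),
]

TIMEOUT = timedelta(minutes=30)  # assumed threshold, not taken from the paper


def sessionize(records, use_referrer=False, timeout=TIMEOUT):
    """Group each user's requests into sessions.

    Time heuristic: start a new session when the gap to the previous
    request exceeds `timeout`.
    Referrer-time variant (use_referrer=True): also start a new session
    when the referrer is not a page already seen in the current session.
    """
    by_user = {}
    for user, ts, url, ref in records:
        stamp = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        by_user.setdefault(user, []).append((stamp, url, ref))

    sessions = []
    for user, reqs in by_user.items():
        reqs.sort(key=lambda r: r[0])  # chronological order per user
        current = []
        for stamp, url, ref in reqs:
            if current:
                gap_too_long = stamp - current[-1][0] > timeout
                seen = {u for _, u, _ in current}
                ref_unknown = use_referrer and ref not in seen
                if gap_too_long or ref_unknown:
                    sessions.append((user, [u for _, u, _ in current]))
                    current = []
            current.append((stamp, url, ref))
        if current:
            sessions.append((user, [u for _, u, _ in current]))
    return sessions


time_sessions = sessionize(LOG)
referrer_sessions = sessionize(LOG, use_referrer=True)
```

On this sample data the time-only heuristic keeps `/blog` in the same session as `/contact` and `/about`, while the referrer-time variant splits it off because its referrer is an external page. This shows the kind of case where a pure timeout rule can merge unrelated visits.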
Authors and Affiliations
Navjot Kaur, Himanshu Aggarwal