A Novel Semantically-Time-Referrer based Approach of Web Usage Mining for Improved Sessionization in Pre-Processing of Web Log
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 1
Abstract
Web usage mining (WUM), also known as web log mining, is the application of data mining techniques to large volumes of data in order to extract useful and interesting user behaviour patterns from web logs and thereby improve web-based applications. This paper aims to improve data discovery by mining usage data from log files. The work is done in three phases. The first and second phases, data cleaning and user identification respectively, are completed using traditional methods. The third phase, session identification, is done using three different methods. The main focus of this paper is the sessionization of the log file, which is a critical step for extracting usage patterns. The proposed referrer-time and semantically-time-referrer methods overcome the limitations of traditional methods. The main advantage of the pre-processing model presented in this paper over other methods is that it can process text or Excel log files of any format. Experiments performed on three different log files indicate that the proposed semantically-time-referrer based heuristic approach achieves better results than the traditional time-based method and the referrer-time based method. The proposed methods are not complex to use. The web log files are collected from different servers and contain the public information of visitors. In addition, this paper discusses different types of web log formats.
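The abstract does not spell out the exact sessionization rules, so the following is only a minimal sketch of how a time-based heuristic and a referrer-time heuristic are commonly formulated. It is not the authors' implementation. The record layout `(user, timestamp, url, referrer)`, the function name `sessionize`, the 30-minute timeout, and the rule "start a new session when the referrer is not a page already seen in the current session" are all assumptions made for illustration. The semantic component of the third method is not sketched, because the abstract gives no detail about it.

```python
from datetime import datetime, timedelta


def sessionize(records, timeout=timedelta(minutes=30), use_referrer=False):
    """Group each user's requests into sessions (illustrative sketch only).

    records: iterable of (user, timestamp, url, referrer) tuples (assumed layout).
    A new session starts when the gap since the user's previous request
    exceeds `timeout` (assumed 30 minutes). If `use_referrer` is True, a new
    session also starts when the referrer is not a URL already visited in
    the current session (an assumed referrer rule).
    Returns {user: [[(timestamp, url), ...], ...]}.
    """
    sessions = {}
    # Process requests per user in chronological order.
    for user, ts, url, ref in sorted(records, key=lambda r: (r[0], r[1])):
        user_sessions = sessions.setdefault(user, [])
        start_new = True  # a user's first request always opens a session
        if user_sessions:
            current = user_sessions[-1]
            last_ts = current[-1][0]
            start_new = ts - last_ts > timeout
            if not start_new and use_referrer:
                seen_urls = {u for _, u in current}
                start_new = ref not in seen_urls
        if start_new:
            user_sessions.append([])
        user_sessions[-1].append((ts, url))
    return sessions


# Hypothetical sample data; "-" stands for an empty referrer field.
t0 = datetime(2017, 1, 1, 10, 0)
records = [
    ("u1", t0, "/home", "-"),
    ("u1", t0 + timedelta(minutes=5), "/products", "/home"),
    ("u1", t0 + timedelta(minutes=10), "/blog", "-"),
    ("u1", t0 + timedelta(minutes=50), "/contact", "/blog"),
]

time_only = sessionize(records)
with_referrer = sessionize(records, use_referrer=True)
```

In this sample, the time-only rule splits the requests only at the 40-minute gap, while the referrer rule additionally splits at `/blog`, whose referrer is not a page from the ongoing session. This illustrates why combining the two signals can produce finer-grained sessions than a timeout alone.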
Authors and Affiliations
Navjot Kaur, Himanshu Aggarwal