PREPROCESSING OF WEB LOGS
Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 7
Abstract
Today’s real-world databases are highly susceptible to noisy, missing and inconsistent data because of their typically huge size and their origin in multiple, heterogeneous sources. Hence, pre-processing of data is necessary to improve the quality of the data and, consequently, of the mining results. There are a number of data pre-processing techniques. In this paper, we discuss two different approaches to data pre-processing, one based on XML and the other based on text files. The basic algorithm and the steps involved in pre-processing are considered the same for both approaches.
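The idea that one set of pre-processing steps can serve both a text-file log and an XML log can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the Common Log Format layout, the `<log>`/`<entry>` XML schema, the helper names `from_text`, `from_xml` and `clean`, the cleaning rule itself, and the made-up sample records.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: text logs follow the Common Log Format.
CLF = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def from_text(lines):
    """Parse raw log lines into dicts, skipping lines that do not match."""
    records = []
    for line in lines:
        match = CLF.match(line)
        if match:
            records.append(match.groupdict())
    return records

def from_xml(xml_text):
    """Parse a hypothetical <log><entry .../></log> document into dicts."""
    root = ET.fromstring(xml_text)
    return [dict(entry.attrib) for entry in root.iter("entry")]

# Hypothetical cleaning rule: drop embedded resources and failed requests.
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")

def clean(records):
    """Keep successful GET requests for pages; identical for both sources."""
    return [
        r for r in records
        if r["method"] == "GET"
        and r["status"].startswith("2")
        and not r["url"].lower().endswith(IGNORED_SUFFIXES)
    ]

# Made-up sample data, the same three requests in both representations.
text_log = [
    '10.0.0.1 - - [10/Oct/2009:13:55:36 +0530] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2009:13:55:37 +0530] "GET /logo.gif HTTP/1.1" 200 512',
    '10.0.0.2 - - [10/Oct/2009:13:56:01 +0530] "GET /missing.html HTTP/1.1" 404 0',
]
xml_log = """<log>
  <entry host="10.0.0.1" time="10/Oct/2009:13:55:36 +0530" method="GET" url="/index.html" status="200"/>
  <entry host="10.0.0.1" time="10/Oct/2009:13:55:37 +0530" method="GET" url="/logo.gif" status="200"/>
  <entry host="10.0.0.2" time="10/Oct/2009:13:56:01 +0530" method="GET" url="/missing.html" status="404"/>
</log>"""

cleaned_from_text = clean(from_text(text_log))
cleaned_from_xml = clean(from_xml(xml_log))
```

In this sketch only the parsing step depends on the storage format; once both sources are reduced to the same record shape, the cleaning step is shared, which is one plausible reading of the claim that the basic algorithm is the same for both approaches.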
Authors and Affiliations
Ms. Dipa Dixit, Ms. M Kiruthika