Web usage mining is the application of data mining procedures to extract interesting knowledge from web logs. A mining result is only as good as the preprocessing of the dataset under consideration. One of the important preprocessing steps is the handling of null (missing) values, which has long been a challenge for researchers. Various methods are available for estimating null values, such as the k-means clustering algorithm, the MARE algorithm, and the fuzzy logic approach. However, these methods are not always efficient.
We propose an efficient approach for handling null values in web logs. It uses a hybrid tabu search and k-nearest-neighbor (K-NN) classifier with multiple distance functions. The tabu search performs feature selection for the K-NN rule, which is useful for improving the classification accuracy of the NN rule. Null values are handled with several different distance functions, which we call an ensemble of functions, and each function is paired with a different set of feature vectors. Because the ensemble members use different distance metrics with different feature sets, it is less likely that they make the same errors. The proposed method is therefore better suited to handling null values.
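The ensemble idea described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the feature subsets are fixed by hand as a stand-in for the tabu search selection step, the three distance functions are an assumed choice, and the toy records and the names `knn_predict` and `ensemble_impute` are hypothetical. Each ensemble member is a K-NN classifier with its own distance function and feature subset, and the missing value is filled with the majority vote of the members.

```python
import math
from collections import Counter

# Three common distance functions (an assumed choice for illustration).
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def knn_predict(train, query, features, distance, k):
    """Majority label among the k nearest complete records,
    comparing only the feature indices listed in `features`."""
    def project(row):
        return [row[i] for i in features]
    q = project(query)
    nearest = sorted(train, key=lambda item: distance(project(item[0]), q))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def ensemble_impute(train, query, members, k=3):
    """Fill the missing field by majority vote over all ensemble members.
    Each member is a (feature_subset, distance_function) pair."""
    votes = Counter(
        knn_predict(train, query, features, distance, k)
        for features, distance in members
    )
    return votes.most_common(1)[0][0]

# Hypothetical, normalized log records: (features, value of the field to impute).
train = [
    ([0.10, 0.20, 0.10], "200"),
    ([0.20, 0.10, 0.20], "200"),
    ([0.15, 0.15, 0.10], "200"),
    ([0.90, 0.80, 0.90], "404"),
    ([0.80, 0.90, 0.85], "404"),
    ([0.85, 0.95, 0.90], "404"),
]

# Hand-picked feature subsets standing in for the tabu search result.
members = [
    ((0, 1), euclidean),
    ((1, 2), manhattan),
    ((0, 2), chebyshev),
]

# A record whose target field is null; only its features are known.
record_with_null = [0.12, 0.18, 0.15]
imputed = ensemble_impute(train, record_with_null, members)
print(imputed)
```

Because each member sees a different view of the record, a feature or metric that misleads one member does not necessarily mislead the others, which is the motivation for voting across them.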
The proposed method, a hybrid classifier with different distance metrics and different feature vectors, is evaluated on our MANIT database. The results indicate a significant increase in performance compared with a simple K-NN classifier.