K-nearest neighbors algorithm


Recent papers in K-nearest neighbors algorithm


Combinations of Jaccard with Numerical Measures for Collaborative Filtering Enhancement: Current Work and Future Proposal

Collaborative filtering (CF) is an important approach for recommendation systems and is widely used in many aspects of daily life, especially in online commercial systems. One popular CF algorithm is K-nearest neighbors (KNN), in which a similarity measure is used to determine the nearest neighbors of a user and thus to quantify the degree of dependency between a user/item pair. Consequently, the CF approach is not merely sensitive to the similarity measure; it is entirely contingent on the choice of that measure. Jaccard, one of the commonly used similarity measures for CF tasks, considers only the existence of ratings, whereas numerical measures such as cosine and Pearson consider the magnitude of ratings. Jaccard is not a dominant measure on its own, but it has long been shown to be an important factor in improving any measure. Therefore, in our continuing effort to find the most effective similarity measures for CF, this research proposes new similarity measures that combine Jaccard with several numerical measures. The combined measures take advantage of both existence and magnitude. Experimental results on the MovieLens dataset showed that the combined measures outperform all single measures on the considered evaluation metrics.
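The idea of pairing an existence-based measure with a magnitude-based one can be sketched as follows. This is an illustration only: the abstract does not state how the measures are combined, so the product used in `combined` is an assumption, and the function names and toy ratings are hypothetical.

```python
import math

def jaccard(u, v):
    """Existence of ratings: shared rated items over all rated items."""
    a, b = set(u), set(v)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine(u, v):
    """Magnitude of ratings: cosine over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def combined(u, v):
    """Assumed combination: product of Jaccard and cosine."""
    return jaccard(u, v) * cosine(u, v)

def nearest_neighbors(target, ratings, k, sim=combined):
    """Return the k users most similar to `target` under `sim`."""
    scored = [(sim(ratings[target], r), user)
              for user, r in ratings.items() if user != target]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [user for _, user in scored[:k]]

# Toy data: each user maps item -> rating.
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 5, "b": 3, "c": 1, "d": 2},
    "u3": {"a": 5, "b": 3, "c": 4},
    "u4": {"z": 1},
}
print(nearest_neighbors("u1", ratings, k=2))  # ['u3', 'u2']
```

Here u2 and u3 agree equally with u1 on the co-rated items, so cosine alone cannot separate them; the Jaccard factor ranks u3 higher because a larger share of its rated items overlaps with u1.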

- by Loc Nguyen's Academic Network and +1
  Loc Nguyen
  Similarity Measures, Recommendation Systems, Jaccard Index, K-nearest neighbors algorithm

AN OPTIMIZED SYSTEM TO SOLVE TEXT-BASED CAPTCHA

CAPTCHA (Completely Automated Public Turing test to Tell Computers and Humans Apart) can be used to protect data from automated bots. Countless kinds of CAPTCHAs have been designed, but text-based schemes are used most frequently because they are convenient and user-friendly [1]. Different types of CAPTCHA currently require different segmentation methods to isolate single characters. Our goal is to defeat the CAPTCHA, so the CAPTCHA first needs to be split character by character. No general segmentation algorithm yields the divided characters for all kinds of examples, which means that segmentation has to be handled case by case. In this paper, we build a complete system to defeat these CAPTCHAs and achieve state-of-the-art performance. In detail, we present our self-adaptive algorithm to segment different kinds of characters optimally, and then use both existing methods and our own convolutional neural network as an extra classifier. Results are provided showing that our system works well in defeating these CAPTCHAs.
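To make the segmentation step concrete, here is a minimal sketch of a common baseline, column-projection segmentation. It is not the self-adaptive algorithm of the paper, which the abstract does not describe; the function name and the binary-image representation are assumptions for illustration. The baseline only works when characters are separated by at least one empty column, which is exactly the limitation that motivates scheme-specific segmentation.

```python
def segment_columns(image):
    """Split a binary image into character column ranges.

    `image` is a list of rows, each a list of 0/1 pixels (1 = ink).
    Returns (start, end) column pairs with `end` exclusive.
    """
    if not image:
        return []
    width = len(image[0])
    # A column belongs to a character if any row has ink in it.
    has_ink = [any(row[c] for row in image) for c in range(width)]

    segments = []
    start = None
    for c, ink in enumerate(has_ink):
        if ink and start is None:
            start = c                      # a character begins
        elif not ink and start is not None:
            segments.append((start, c))    # a character ends
            start = None
    if start is not None:                  # character touches right edge
        segments.append((start, width))
    return segments

# Two toy "characters" separated by blank columns.
toy = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
]
print(segment_columns(toy))  # [(1, 3), (5, 7)]
```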


An Efficient Diagnosis System for Detection of Liver Disease Using a Novel Integrated Method Based on Principal Component Analysis and K-Nearest Neighbor (PCA-KNN)

When organ failure is mentioned, people immediately think of kidney disease. By contrast, there is no such awareness of liver disease and liver failure, even though liver disease is one of the leading causes of mortality worldwide. Effective diagnosis and timely treatment of patients are therefore paramount. This study accordingly aims to construct an intelligent diagnosis system that integrates principal component analysis (PCA) and k-nearest neighbor (KNN) methods to examine the liver patient dataset. The model combines feature extraction and classification, performed by PCA and KNN respectively. Prediction results of the proposed system are compared using statistical parameters that include accuracy, sensitivity, specificity, positive predictive value and negative predictive value. In addition to higher accuracy rates, the model also attained remarkable sensitivity and specificity, which was a challenging task given an uneven variance among a...
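The five evaluation parameters named above have standard definitions in terms of confusion-matrix counts. The sketch below computes them for a binary classifier; the function name, the label convention (1 = disease, 0 = healthy), and the toy labels are assumptions for illustration and are not taken from the paper.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute standard binary diagnostic metrics.

    Labels are assumed to be 1 (disease) and 0 (healthy).
    A metric with a zero denominator is reported as NaN.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
    }

# Toy labels, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when classes are imbalanced, since accuracy alone can look high while one class is mostly misclassified.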


Improving k-Nearest Neighbour Classification with Distance Functions Based on Receiver Operating Characteristics


Using k-Nearest Neighbor and Feature Selection as an Improvement to Hierarchical Clustering

- by Stefanos Kollias
  Set Theory, Computer Science, Algorithms, Database Systems

Stock Price Prediction Using K-Nearest Neighbor (kNN) Algorithm

Stock price prediction is an interesting and challenging research topic. The economies of developed countries are measured by their economic power. Currently, stock markets are considered an illustrious trading field because in many cases they give easy profits with a low-risk rate of return. The stock market, with its huge and dynamic information sources, is considered a suitable environment for data mining and business researchers. In this paper, we applied the k-nearest neighbor algorithm and a non-linear regression approach to predict stock prices for a sample of six major companies listed on the Jordanian stock exchange, to assist investors, management, decision makers, and users in making correct and informed investment decisions. According to the results, the kNN algorithm is robust with a small error ratio; consequently, the results were rational and reasonable. In addition, depending on the actual stock prices data; the prediction results were close and almost par...
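One plausible way to apply kNN to price prediction is kNN regression over sliding windows of past prices. The sketch below is an assumption about the setup, since the abstract does not describe the features used; the function names, the window width, and the toy price series are all hypothetical.

```python
import math

def make_windows(series, width):
    """Turn a price series into (window of past prices, next price) pairs."""
    n = len(series) - width
    X = [tuple(series[i:i + width]) for i in range(n)]
    y = [series[i + width] for i in range(n)]
    return X, y

def knn_regress(train_X, train_y, query, k):
    """Predict the mean target of the k training windows nearest to `query`."""
    if k < 1 or k > len(train_X):
        raise ValueError("k must be between 1 and the number of samples")
    # Euclidean distance between the query window and each training window.
    scored = sorted((math.dist(x, query), target)
                    for x, target in zip(train_X, train_y))
    nearest = scored[:k]
    return sum(target for _, target in nearest) / k

# Toy series, not real market data.
prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, y = make_windows(prices, width=2)
prediction = knn_regress(X, y, query=(5.0, 6.0), k=2)
print(prediction)  # 5.5
```

The toy result also shows a known limitation of this approach: the prediction is an average of previously seen targets, so it cannot extrapolate beyond the range of prices in the training windows.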

- by Mohammed Shatnawi
  Business, Computer Science, K-nearest neighbors algorithm

Path Normalcy Analysis Using Nearest Neighbor Outlier Detection


Advanced Cosine Measures for Collaborative Filtering

Cosine similarity is an important measure for comparing two vectors in much data mining and information retrieval research. In this research, the cosine measure and its advanced variants for collaborative filtering (CF) are evaluated. The cosine measure is effective, but it has a drawback: the end points of two vectors may be far from each other in Euclidean distance while their cosine is still high. This is a negative effect of Euclidean distance that decreases the accuracy of cosine similarity. Therefore, a so-called triangle area (TA) measure is proposed as an improved version of the cosine measure. The TA measure uses the ratio of the basic triangle area to the whole triangle area as a reinforcing factor for Euclidean distance, so that it alleviates the negative effect of Euclidean distance while keeping the simplicity and effectiveness of both the cosine measure and Euclidean distance in measuring the similarity of two vectors. TA is considered an advanced cosine measure. TA and other advanced cosine measures are tested against other similarity measures. In the experimental results, TA is not a preeminent measure, but it is better than traditional cosine measures in most cases and is also adequate for real-time applications. Moreover, its formula is simple.
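The drawback described above is easy to reproduce numerically. The sketch below only demonstrates the problem with plain cosine similarity; it does not implement the TA measure, whose formula is not given in the abstract. The function name and the example vectors are chosen for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Two rating vectors pointing in the same direction but far apart.
low = (1.0, 1.0)
high = (10.0, 10.0)

cos = cosine_similarity(low, high)
dist = math.dist(low, high)
print(f"cosine = {cos:.4f}, Euclidean distance = {dist:.4f}")
```

The two vectors are parallel, so their cosine is 1 (up to floating-point rounding) even though their end points are about 12.73 apart. A user who rates everything 1 and a user who rates everything 10 therefore look identical to plain cosine.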

- by Loc Nguyen's Academic Network and +1
  Loc Nguyen
  Collaborative Filtering, Similarity Measures, Cosine Similarity, K-nearest neighbors algorithm

Evaluating a Nearest-Neighbor Method to Substitute Continuous Missing Values

- by Eduardo Hruschka
  Computer Science, Data Mining, Breast Cancer, Artificial Intelligence (AI)

Architecture Reduction of a Probabilistic Neural Network by Merging K–Means and K–Nearest Neighbor Algorithms


Non-rigid Surface Registration using Cover Tree based Clustering and Nearest Neighbor Search


Comparative Analysis of Data Structures for Approximate Nearest Neighbor Search

Similarity searching has a vast range of applications in various fields of computer science. Many methods have been proposed for exact search, but they all suffer from the curse of dimensionality and are thus not applicable to high-dimensional spaces. Approximate search methods are considerably more efficient in high-dimensional spaces. Unfortunately, there are few theoretical results regarding the complexity of these methods and there are no comprehensive empirical evaluations, especially for non-metric spaces. To fill this gap, we present an empirical analysis of data structures for approximate nearest neighbor search in high-dimensional spaces. We provide a comparison with recently published algorithms on several data sets. Our results show that small world approaches provide some of the best tradeoffs between efficiency and effectiveness in both metric and non-metric spaces.
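Evaluating an approximate method requires an exact baseline to compare against. The sketch below shows a brute-force exact search with a pluggable distance function (so it also covers non-metric dissimilarities) and a recall score for an approximate result. The abstract does not name its effectiveness metric, so using recall here is an assumption, and the function names are hypothetical.

```python
import heapq
import math

def exact_knn(points, query, k, dist=math.dist):
    """Brute-force search: indices of the k points closest to `query`.

    `dist` can be any dissimilarity function; it need not be a metric.
    Cost is one distance evaluation per point, which is the baseline
    that index structures try to beat.
    """
    return heapq.nsmallest(k, range(len(points)),
                           key=lambda i: dist(points[i], query))

def recall(approx_ids, exact_ids):
    """Fraction of the true nearest neighbors that the approximate search found."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.0, 2.0)]
truth = exact_knn(points, query=(0.0, 0.0), k=2)
print(truth)                   # [0, 1]
print(recall([0, 3], truth))   # 0.5
```

An approximate index is then judged on two axes: how much faster it answers than `exact_knn`, and how close its recall stays to 1.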

