College Recommender System Using Students' Preferences/Voting: A System Development With Empirical Study
large size of the product (object) space and context space. The main goal of recommender systems is to assist their users in finding their preferred objects from the large set of available objects. The voting of a particular customer on a particular object is learned through a random payoff, and this payoff is received by the recommender system based on the response details of the customer to the recommendation system. For example, in a course recommendation, the payoffs are ratings on a scale of 1 to 10, where the ratings are given by the students. In the case of web page recommendations, the payoffs are counted by customers' clicks, where the Boolean value 1 denotes a click and 0 denotes no click.

Web mining is an important technique for finding frequent data patterns from the Internet, data warehouses, data marts, data sets, and so on. The World Wide Web (WWW) is a powerful platform; it is considered the ultimate provider of the information superhighway, used to store and retrieve information and also to mine useful knowledge and then use it for predicting the interests/requirements of customers. Web data is huge, unstructured, and dynamic in nature. Hence, recommendation systems are desirable information systems for predicting feature values according to the requirements of the customer. Web recommendation information systems are very useful for navigating through web pages and getting the desired information quickly.

Nowadays recommendation systems are popular, and they try to suggest different types of items to different users. The items may be books, chairs, tables, pens, movies, music, washing machines, computers, printers, plotters, and so on. For example, Amazon.com recommends various items to various users based on knowledge of items previously visited, purchased, ordered, enquired about, referred, or booked.

Zhibo Wang et al. [23] proposed a unique similarity-based metric to find the similarity of users in terms of their lifestyles, and they constructed a Friendbook system to recommend friends based on their lifestyles. Personal information of the user is not considered there.

Recommendation systems have developed in parallel with web technology (J. Bobadilla et al. [15]). Initially they were based on demographic, content-based, and collaborative filtering; now they are in a position to incorporate social information as well. A knowledge-based recommendation system considers user-centric requirements rather than the user's past history in order to make recommendations.

Hector Nunez et al. [12] discussed the comparison of different similarity measures for improving the classification process. The authors said that automatic knowledge acquisition and management methods are needed to build consistent, robust, reliable, fault-tolerant, and effective decision support systems.

2.2 Recommendation systems for college selection

Fazeli Soude et al. [9] said that recommender systems are being used in many real-world applications, such as the e-commerce applications of Amazon and eBay. Recommender systems must be accurate and useful to as many users as possible. The fundamental goal of educational recommender systems is to satisfy many quality features such as accuracy, usefulness, effectiveness, novelty, completeness, and diversity. Recommender systems must satisfy user-centric requirements; user-centric recommender systems are more useful than data-centric recommender systems.

Recommender systems have been developed for various domains associated with people's daily lives, such as product recommendation, service recommendation, people recommendation, and so on. These recommendations increase both user convenience and purchase transactions of products and/or services. Course/college recommendation for students is a challenging domain that has not reached the target community thoroughly.

Since there are many options for colleges/courses, students have to spend a lot of time exploring the details, and they may not do it in a proper way. Students need a system that accepts the students' preferences and recommends the right college/course. College selection is one of the issues that the student community tends to solve. Recommender systems help the students decide in which college they should study. The existing methods for recommendation are content-based filtering, collaborative filtering, and rule mining approaches. The content-based filtering approach recommends an item to a user by clustering the items and the user pairs into groups; this clustering is used to gain similarity between user and item.

Queen Esther Booker created a prototype of a system for course recommendations [18]. The system accepts user requirements as keywords and recommends courses for students.

The collaborative filtering (CF) approach recommends an item to a user by grouping similar users based on user profiles and predicting the users' interest in the items. Hana introduces a system based on the CF approach to recommend courses for a student by analyzing and matching students' academic records [11]. Then the system analyses
IJCSNS International Journal of Computer Science and Network Security, VOL.18 No.1, January 2018 89
and recommends a course that meets the student's profile. Elham S. Khorasani et al. proposed a collaborative filtering model based on a Markov chain to recommend courses based on historical data [7].

The rule mining approach focuses on recommending a series of items to a user by discovering association rules. Itmazi and Megias developed a recommendation system based on rule mining to recommend learning objects [14].

A top-k query finds the ranking details of objects based on a score function value, which is computed from the customers' voting/preferences for the values of the attributes. The score function is a linear function that gives the sum of the products of the attribute values and their corresponding voting/preferences. Mathematically, the linear score function is denoted as

f(w, p) = Σ_{i=1}^{d} w_i · p_i

where p is an object with d attribute values p_i and w is a customer's vector of voting/preference weights w_i.
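As an illustration only (the college objects, attribute values, and preference weights below are hypothetical, not taken from the paper's data set), the linear score function and a single student's top-k evaluation can be sketched in Java as:

```java
import java.util.*;

// Sketch of the linear score function f(w, p) = sum_i w_i * p_i and a
// top-k evaluation for one student; names and data are illustrative.
public class TopKSketch {

    // Linear score: sum of attribute values times preference weights.
    static double score(double[] weights, double[] attributes) {
        double s = 0.0;
        for (int i = 0; i < weights.length; i++) {
            s += weights[i] * attributes[i];
        }
        return s;
    }

    // Return the indices of the k highest-scoring objects for one student.
    static List<Integer> topK(double[] weights, double[][] objects, int k) {
        Integer[] idx = new Integer[objects.length];
        for (int i = 0; i < objects.length; i++) idx[i] = i;
        Arrays.sort(idx, (a, b) ->
            Double.compare(score(weights, objects[b]), score(weights, objects[a])));
        return Arrays.asList(idx).subList(0, k);
    }

    public static void main(String[] args) {
        // Three colleges described by two attributes each (hypothetical values).
        double[][] colleges = { {0.9, 0.2}, {0.5, 0.5}, {0.1, 0.8} };
        double[] studentWeights = {0.3, 0.7};  // one student's preferences
        System.out.println(topK(studentWeights, colleges, 2));  // prints [2, 1]
    }
}
```

With the weights above, the scores are 0.41, 0.50, and 0.59, so the student's top-2 list is colleges 2 and 1.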
data analysis and decision making. The authors proposed two algorithms for finding the most influential database objects. The first uses properties of the sky-band (SB) set to limit the maximum number of resultant candidate objects, and the second follows the branch and bound (BB) algorithm paradigm and uses an upper bound on the influence score.

Many techniques are available for evaluating reverse top-k queries, but they are costly in terms of overhead and hence require significant processing, which results in the execution of multiple top-k queries to find the total number of customers who prefer the queried object. The reverse top-k query produces sets of customers based on object preferences. These sets represent the customers who prefer to include the object in their favorite lists. The reverse top-k query is one type of tool for estimating the impact or demand of an object in the market.

Vlachou Akrivi et al. [21] proposed the reverse top-k query with two versions, monochromatic and bichromatic reverse top-k queries. The authors proposed an efficient threshold-based algorithm for finding bichromatic reverse top-k queries.

Amit Singh et al. [2] proposed an approximate solution to answer reverse nearest neighbor queries in high-dimensional spaces. The authors said that the approach is mainly based on a strong correlation between the k-nearest neighbor (k-NN) and the reverse nearest neighbor (RNN) in connection with the Boolean range query (BRQ).

Note that the performance of the reverse top-k query mainly depends on the number of top-k query executions for each object, and top-k query execution in turn depends on the voting/preferences of the customers. A reverse top-k query retrieves the set of customers for whom the object belongs to their top-k result sets. Reverse top-k sets are frequently used for finding the potential demand of objects in the market. Reverse top-k query executions are costly; hence there is a need for approximate reverse top-k query executions, both for increased scalability and for speedup of the overall execution. Also, effective planning techniques are required. The performance of the R-tree index data structure decreases as the dimensionality of the data sets increases, and the performance of all the algorithms based on the R-tree deteriorates accordingly. In such cases, alternative efficient and effective indexing techniques and algorithms are needed.

Elke Achtert et al. [8] said that all the existing generalized reverse k-nearest neighbor (RkNN) search methods are applicable only to Euclidean distances but not to general metric objects. As a result, the authors proposed the first approach for efficient reverse k-nearest neighbor search in arbitrary metric spaces (RkNNSAMS), in which the k value is given at query run time.

4.3 R-tree

The R-tree is a multidimensional indexing tree data structure. It is a popular, frequently used, height-balanced index structure, very useful for the efficient management of very large training datasets, particularly in many real-time applications involving data-critical operations. The R-tree is also useful and efficient for customer-voting-based similarity search. In customer-voting-based similarity data search, the R-tree multidimensional index structure is used with slight modifications: a finite set of constants is applied to the bounds of the similarity values of the query points when inserting index entries.

In general, for efficient and fast access to very large datasets, a multidimensional data access technique is needed for many real-time tasks. The R-tree organizes data records in the form of hyper-rectangles, usually called minimum bounding rectangles (MBRs), arranged in a tree hierarchy. The R-tree is height-balanced, and all object data are stored in the leaves. Small rectangles are included at the bottom level, and as the R-tree is traversed from bottom to top, a specific set of lower-level small rectangles is grouped into one big higher-level rectangle. Lee Ken C. K. et al. [16] said that the R-tree and its variants, R+-trees, R*-trees, and aR-trees, are data-partitioning index techniques useful for clustering data objects in terms of minimum bounding boxes with an abstract mechanism. They proposed a variant of the reverse nearest neighbor query called the ranked reverse nearest neighbor query, and then proposed two algorithms, k-Counting and k-Browsing, for executing the proposed query efficiently.

Each MBR is defined by two points, the lower left corner and the upper right corner, and is represented as M(lower x1, y1; upper x2, y2). In general, the lower and upper x and y points may not be part of the actual data set. For efficient query processing of customer-voting-based similarity data search, index creation is inevitable for large data, and the R-tree multidimensional index structure is mandatory for index creation.

Duc Thang Nguyen et al. [6] said that fast, usable, and simple methods with reasonably good performance are always
better than the best-performing algorithm, which wins only in some cases and is rarely used because of its high complexity. Data clustering is one of the most important topics in data mining. Clustering is a method of arranging data objects into convenient and meaningful subgroups for further analysis, study, use, and application for effective data management. At present, the k-means algorithm is in the top-10 list of the most important data mining algorithms. The main advantages of the k-means algorithm are scalability, simplicity, robustness, understandability, speed, and ease of use. Its main disadvantages are that selecting the initial number of cluster centers is difficult and that its time complexity is O(n^2).

Charif Haydar and Anne Boyer [4] proposed a clustering algorithm called mutual vote (MV), based on a statistical model. The authors said that their proposed clustering algorithm adjusts automatically to the data set and requires minimal parameters.

Dino Ienco et al. [5] said that clustering data objects containing only categorical attributes is a tedious task because defining a distance value between pairs of categorical attributes is difficult. The authors proposed a framework to find a distance measure between categorical attributes. Madhavi et al. [17] formulated measures on data containing categorical attributes. They categorized the existing measures as context-free and context-sensitive measures for categorical data. Usue Mori et al. [20] said that the famous Euclidean distance and the common measures used for non-temporal data are not always the best methods for finding similarity between time series data, because they do not deal with noise and misalignments in the time series; the authors said that the Euclidean distance suffers from the noise and outliers problem.

Yung-Shen Lin et al. [22] said that similarity measures are used extensively in text classification and clustering. In the literature, the various methods used for similarity comparison are the Euclidean distance, Manhattan (taxicab) distance, cosine similarity measure, city-block distance, Bray-Curtis measure, Jaccard coefficient, extended Jaccard coefficient, Hamming distance, Dice coefficient, IT-Sim, and so on. The authors proposed a new measure for computing the similarity between two documents and extended it to measure the similarity between two sets of documents. The proposed measure is applied in many real applications such as k-means-like clustering, classification, and hierarchical clustering.

5. Proposed algorithm

ALGORITHM WCLUSTER (Threshold, Root, D)
INPUT
  Threshold: user-specified similarity limit
  Root: indexed tree
  D: the dataset
OUTPUT
  Set of clusters

1. Initialize cluster number i = 1
2. While D is not empty do
3.   Object = first object in D
4.   Cluster set c_i = Theta-Similarity-Query (Root, Threshold, Object)
5.   i = i + 1
6.   Update input dataset D = D - c_i
7. End-While

ALGORITHM Theta-Similarity-Query (Root, theta, q)
Input
  Root: root node of the R-tree
  theta: the similarity measure threshold value
  q: the query object
Output
  result-set: the set of similar objects

1. node = create a new tree node
2. node = Root
3. if (minimum-similarity(node, q) ≥ theta) then
4.   result-set = result-set UNION p for every sub-tree(node)
5. end-if
6. if (node.type = leaf-node) then
7.   for every p_i in the node do
8.     reverse p_i vector = execute reverse top-k (p_i)
9.     if (minimum-similarity(p_i, q) ≥ theta) then
10.      result-set = result-set UNION p_i
11.    end-if
12.  end-for
13. else
14.  for every sub-tree of node do
15.    if (maximum-similarity(sub-tree, q) ≥ theta) then
16.      node = sub-tree(node)
17.    end-if
18.  end-for
19. end-if
20. if (node is not empty) then
21.  Theta-Similarity-Query (node, theta, q)
22. end-if
23. return (result-set)
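A minimal sketch of the WCLUSTER loop is shown below. For brevity it uses a flat linear scan in place of the Theta-Similarity-Query R-tree traversal, and the college objects, student sets, and threshold are hypothetical:

```java
import java.util.*;

// Sketch of the WCLUSTER loop: repeatedly pick the first remaining object,
// collect every object whose similarity to it is >= threshold into one
// cluster, and remove that cluster from the dataset (D = D - c_i).
public class WClusterSketch {

    // Jaccard similarity |a INTERSECT b| / |a UNION b| between student-id sets.
    static double jaccard(Set<Integer> a, Set<Integer> b) {
        Set<Integer> inter = new HashSet<>(a); inter.retainAll(b);
        Set<Integer> union = new HashSet<>(a); union.addAll(b);
        return union.isEmpty() ? 0.0 : (double) inter.size() / union.size();
    }

    static List<List<String>> wcluster(Map<String, Set<Integer>> dataset, double threshold) {
        List<List<String>> clusters = new ArrayList<>();
        LinkedHashMap<String, Set<Integer>> remaining = new LinkedHashMap<>(dataset);
        while (!remaining.isEmpty()) {
            String seed = remaining.keySet().iterator().next(); // first object in D
            List<String> cluster = new ArrayList<>();
            for (Map.Entry<String, Set<Integer>> e : remaining.entrySet()) {
                if (jaccard(remaining.get(seed), e.getValue()) >= threshold) {
                    cluster.add(e.getKey());
                }
            }
            remaining.keySet().removeAll(cluster); // remove the cluster from D
            clusters.add(cluster);
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Hypothetical colleges, each with the set of students who referenced it.
        Map<String, Set<Integer>> data = new LinkedHashMap<>();
        data.put("A", new HashSet<>(Arrays.asList(1, 2, 3)));
        data.put("B", new HashSet<>(Arrays.asList(1, 2, 4)));
        data.put("C", new HashSet<>(Arrays.asList(7, 8, 9)));
        System.out.println(wcluster(data, 0.4));  // prints [[A, B], [C]]
    }
}
```

The linear scan costs O(n) per cluster; the point of the R-tree traversal in the paper is to prune whole subtrees whose maximum similarity falls below theta.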
ALGORITHM similar (p, q)
Input
  p: the object (college) present in the leaf node of the R-tree
  q: the queried object (college)
Output
  a numeric value representing the similarity measure between the two objects

1. a = the list of students who referenced college object p
2. b = the list of students who referenced college object q
3. similarity = |a ∩ b| / |a ∪ b|
4. return similarity

Reverse top-k computation: ALGORITHM ReverseTopkFull ()

reverseTopk[][] = new int [colleges][students]
for i = 1 to number of colleges do
{
  col = 0
  for j = 0 to number of rows (students) in the top-k result set
  {
    for k = 0 to number of elements in the row
    {
      if (topkresultset[j][k] == i) then
        reverseTopk[i-1][col++] = j+1
    }
  }
}

ALGORITHM reverseTopklist (obj)
Input
  obj: a college object
Output
  the list of students
for i = 1 to number of colleges do
{
  if (collegelist[i][1] == obj) then
    return the i-th row list in reverseTopk[i]
  endif
}

The WCLUSTER algorithm makes use of the above similarity search algorithm. WCLUSTER provides an exhaustive set of clusters. In each step of the iterative process, a cluster is separated from the whole dataset, and the remaining dataset is the candidate for the next iteration. The process ends when all the elements of the master dataset have been clustered.

The algorithm Theta-Similarity-Query returns all the similar objects of a given object q. The object may be any item, such as a tuple, product, book, patient, medicine, profile, mobile, wine, and so on. The R-tree index structure is mainly used for fast searching. During each search operation, in each iteration a node is examined; if the node satisfies a maximum-similarity value greater than or equal to the theta value, then all the nodes within the sub-tree of that node are searched recursively, and all the tuples of each node are processed based on the minimum-similarity condition, by which some tuples or objects are added to the result set. Whenever a leaf node is referenced, the Jaccard similarity measure is applied to all the objects of the leaf node by executing the reverse top-k query for each object; at the same time, the similarity condition similar(p, q) ≥ theta is also tested, and the corresponding object is added to the result set. During the computation of the similarity measure, different types of pruning techniques are applied.

6. Comparison of proposed algorithm with traditional methods

The data grouping in recommender systems traditionally follows the k-means approach. This k-means approach treats each attribute alike and does not consider weights with respect to priority attributes. In addition, the traditional approach needs high computational effort. The proposed approach using the R-tree saves a significant amount of computation time. The traditional approach needs comparatively more iterations for clustering than the proposed R-tree based method. The time complexity of the proposed approach is sub-linear, whereas traditional methods like the k-means algorithm need O(n^2) time.

The time complexity of the search operation in an R-tree is O(log n) in the best case, when all the colleges belong to a single cluster and the R-tree is searched once; hence the best-case time complexity is O(log n). In the worst case, when no two engineering colleges have the same profile of attributes, the R-tree is searched n times, where n is the number of engineering colleges; hence the worst-case time complexity of the proposed algorithm is O(n log n). The average-case time complexity of the algorithm may be anywhere between O(log n) and O(n log n), and it can be estimated as

Average ≈ (O(log n) + O(n log n)) / 2 ≈ O(n log n)

Hence, the best-case, average-case and worst-case time complexities of the proposed algorithm are O(log n), O(n log n) and O(n log n) respectively. In many real-time cases the average time complexity is considered to be the best estimator of an algorithm's running time.

Hence, in terms of time complexity, the proposed algorithm is superior to many of the traditional clustering algorithms.
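The reverse top-k inversion and the Jaccard similarity step above can be sketched together in Java (the students' top-k lists below are hypothetical, and simple collections replace the paper's array-based bookkeeping):

```java
import java.util.*;

// Sketch of the pipeline: given each student's top-k college list, invert it
// into reverse top-k lists (students per college), then compute the Jaccard
// similarity |a INTERSECT b| / |a UNION b| between two colleges.
public class ReverseTopKSketch {

    // Invert top-k lists: college id -> set of student ids whose top-k contains it.
    static Map<Integer, Set<Integer>> reverseTopK(int[][] topKPerStudent) {
        Map<Integer, Set<Integer>> rev = new HashMap<>();
        for (int student = 0; student < topKPerStudent.length; student++) {
            for (int college : topKPerStudent[student]) {
                rev.computeIfAbsent(college, c -> new HashSet<>()).add(student);
            }
        }
        return rev;
    }

    static double jaccard(Set<Integer> a, Set<Integer> b) {
        Set<Integer> inter = new HashSet<>(a); inter.retainAll(b);
        Set<Integer> union = new HashSet<>(a); union.addAll(b);
        return union.isEmpty() ? 0.0 : (double) inter.size() / union.size();
    }

    public static void main(String[] args) {
        // Top-2 colleges for each of four students (hypothetical data).
        int[][] topK = { {1, 2}, {1, 3}, {2, 1}, {3, 2} };
        Map<Integer, Set<Integer>> rev = reverseTopK(topK);
        // College 1 is ranked by students {0,1,2}; college 2 by {0,2,3}.
        System.out.println(jaccard(rev.get(1), rev.get(2)));  // prints 0.5
    }
}
```

Two colleges are similar under this measure exactly when largely the same students place them in their top-k lists, which is what the Theta-Similarity-Query leaf step tests against theta.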
Georgoulas Konstantinos et al. [10] introduced a new user-centric approach for finding object similarities. The new approach considers not only the values of the attributes of objects; the preferences for the attributes of objects are also used in finding similarities between objects. The authors said that the proposed technique is very useful for business organizations in finding the business status details of a particular product/object; a more efficient, effective, and optimal marketing business policy can be established, and products can be clustered based on the preferences of customers.

Table 1: Existing k-means clustering algorithm execution times
Sno  Number of colleges  Execution time in seconds  Clusters
1    50                  9                          6
2    100                 23                         9
3    150                 53                         13
4    200                 94                         14
5    296                 170                        18

Fig. 2 Execution times of k-means and proposed algorithms

Experimentally obtained execution time details of both the existing k-means clustering algorithm and the proposed W-clustering algorithm with R-tree are shown in Table-1 and Table-2 respectively. Two different graphs, a column chart and a line chart, are drawn in Figure-1 and Figure-2 respectively for the experimentally obtained data shown in Table-1 and Table-2. After observing the two graphs shown in Figure-1 and Figure-2, it is clear that for datasets of small sizes the difference between the execution times of the existing k-means algorithm and the proposed W-clustering algorithm is very small, and that the difference in execution times increases rapidly as the sizes of the datasets increase. For very large datasets the k-means algorithm is not scalable, whereas the proposed W-cluster algorithm is scalable to the maximum extent and is suitable for many real-world applications because of the large-data indexing capability of the R-tree indexing technique. Figure-3 shows that the number of clusters in the proposed W-cluster technique increases gradually as the size of the dataset increases.

8. The application

The college recommender system is implemented in Java, and its main application is to take an optimal decision in selecting the best college for EAMCET admissions. The proposed algorithm was applied to a college data set having 296 records, in which each record contains 7 attributes. The present system also uses a student data set which contains the students' individual preferences for various attributes pertaining to various colleges. During the process of college clustering, both of the above data sets are used. The execution process is applied by dividing the data sets into different cases using both fixed and variable parameters. Experimentally obtained results are placed in the form of tables and figures.
(Table of experimentally obtained clusters for cases 2 and 3, executed in 3 min 55 sec and 3 min 52 sec respectively; each cluster is listed as the set of college identifiers it contains, and the full membership lists are not reproduced here.)
Table 8: total data set size versus variable sizes of weights and execution times
Serial No.  Maximum Students  Execution Time            Number of Clusters
1           100               5 min 9 sec = 309 sec     20
2           200               8 min 41 sec = 521 sec    17
3           300               11 min 29 sec = 689 sec   20
4           400               14 min 5 sec = 845 sec    21
5           500               17 min 4 sec = 1024 sec   21
6           600               20 min 21 sec = 1221 sec  21
7           700               23 min 56 sec = 1436 sec  20
8           800               27 min 12 sec = 1632 sec  21
9           900               31 min 55 sec = 1925 sec  21
10          1000              35 min 49 sec = 2149 sec  21

Fig. 7. Relation between similarity measure and number of clusters

Figure-7 shows that the total number of clusters generated decreases smoothly and continuously as the required similarity between cluster objects increases. This is true because, when the similarity threshold is set very high, many objects do not satisfy the set threshold similarity value and consequently are not included in any of the clusters; many objects are excluded from the clustering process because their similarity value is below the threshold, which results in a decrease in the number of clusters.
Table 11: Summarization of the results of the first three cases executed. The details of the remaining cases resemble the first three cases and so are not summarized again.

Case 1 (50 colleges, 50 students, execution time 12 sec, 2 clusters):
  [14, 20, 22, 25, 26, 28, 29, 30, 31, 33, 34, 35, 36, 37, 39, 40, 41, 46, 48, 49, 6, 7]
  [112, 167, 172, 173, 193, 204, 22, 228, 230, 235, 236, 239, 240, 255, 265, 275, 276, 282, 285, 33, 35, 36, 37, 39, 41, 56, 64, 7, 70, 84]

Case 2 (100 colleges, 100 students, execution time 81 sec, 4 clusters):
  [1, 14, 20, 22, 31, 34, 37, 48, 49, 51, 52, 6, 63, 7, 73, 82, 83, 89, 94, 97]
  [28, 29, 30, 33, 35, 36, 39, 40, 41, 46, 56, 57, 60, 65, 66, 67, 68, 70, 71, 84, 85]
  [1, 22, 31, 37, 63, 7, 83, 89, 97]
  [25, 26, 64, 63]

Case 3 (150 colleges, 150 students, execution time 145 sec, 5 clusters):
  [1, 104, 105, 106, 108, 112, 113, 114, 115, 117, 118, 120, 124, 125, 136, 14, 140, 146, 147, 150, 20, 22, 25, 26, 31, 34, 37, 48, 49, 51, 52, 6, 63, 64, 7, 73, 82, 83, 89, 94, 97]
  [121, 122, 139, 142, 143, 144, 145, 30, 33, 35, 36, 40, 41, 46, 56, 57, 60, 65, 66, 67, 68, 70, 71, 85]
  [1, 112, 147, 22, 31, 37, 52, 63, 64, 7, 73, 82, 83, 89, 97]
  [122, 29, 33, 35, 36, 39, 40, 41, 46, 56, 57, 67, 68, 70, 84]
  [147, 20, 6, 63, 148, 149, 28, 33, 35, 36, 41, 56, 57, 67, 68, 70]
  [150, 104, 83, 94, 97, 85, 33, 35, 36, 56, 84]
Sno  Number of Colleges  Number of Students  Execution time in sec  Number of clusters
1    50                  50                  12                     2
2    100                 100                 81                     4
3    150                 150                 145                    5
4    200                 200                 211                    5
5    296                 300                 836                    18
6    296                 400                 836                    20

Fig. 8 Relationships among colleges, students, execution time and clusters

Figure-8 shows the relationships among colleges, students, execution times and the clusters formed after execution. The number of colleges and the execution times increase linearly up to a certain point; beyond that point the execution time curve follows an exponential growth rate, as is the case with many real-world large data sets. The number of clusters increases smoothly as the numbers of colleges and students increase.

10. Conclusions

A novel technique for college recommendation was presented. A potent problem of the college recommender system was undertaken and solved with the proposed grouping and recommendation technique. The proposed technique is well suited to present trends in the data available. Intelligent and time-saving recommendation systems can be developed embedding the proposed R-tree and top-k query approaches. The same was implemented and applied to develop a recommender system for college selection based on students' preferences. The results showed that the proposed technique is more reliable, more intelligent and faster than the existing approaches.

A novel technique for top engineering college recommendation is developed. A potent problem of the engineering college recommender system for students is undertaken to solve many of the problems that frequently occur during the EAMCET admission process with respect to student voting/preferences/ratings/opinions. A new intelligent and time-saving system is developed based on the R-tree, top-k query, and student voting/preference approaches. The developed system is tested on data collected from various engineering colleges. The college data set represents all the profile attributes of the engineering colleges. Also, the students' voting/preferences are collected with respect to college attributes and used in the present recommendation system. Experimental results show that the proposed system is reliable, faster, intelligent and more useful for aspirants of engineering college admissions. In
the future, the system can be extended to admissions such as those of IITs, IIITs, and NITs. The same setup can also be extended to many more applications relating to recommender systems that can exhibit the same betterments.

Acknowledgement

To collect the related data, an online survey was conducted using a questionnaire. I am thankful to all and sundry who participated and cooperated in the data collection.

References
[1] Akrivi Vlachou, Christos Doulkeridis, Kjetil Norvag, and Yannis Kotidis, "Identifying the Most Influential Data Objects with Reverse Top-k Queries," Proceedings of the VLDB Endowment, Vol. 3, No. 1, 2010.
[2] Amit Singh, Hakan Ferhatosmanoglu, and Ali Saman Tosun, "High Dimensional Reverse Nearest Neighbor Queries," CIKM '03, November 3-8, 2003, New Orleans, Louisiana, USA.
[3] C. C. Aggarwal, Recommender Systems: The Textbook, Springer International Publishing, Switzerland, 2016.
[4] Charif Haydar and Anne Boyer, "A New Statistical Density Clustering Algorithm based on Mutual Vote and Subjective Logic Applied to Recommender Systems," UMAP '17, July 9-12, 2017, Bratislava, Slovakia.
[5] Dino Ienco, Ruggero G. Pensa, and Rosa Meo, "From Context to Distance: Learning Dissimilarity for Categorical Data Clustering," ACM Journal, 2009.
[6] Duc Thang Nguyen, Lihui Chen, and Chee Keong Chan, "Clustering with Multiviewpoint-Based Similarity Measure," IEEE Transactions on Knowledge and Data Engineering, Vol. 24, No. 6, June 2012.
[7] Elham S. Khorasani, Zhao Zhenge, and John Champaign, "A Markov Chain Collaborative Filtering Model for Course Enrollment Recommendations," 2016 IEEE International Conference on Big Data (Big Data), pp. 3484-3490.
[8] Elke Achtert, Christian Bohm, Peer Kroger, Peter Kunath, Alexey Pryakhin, and Matthias Renz, "Efficient Reverse k-Nearest Neighbor Search in Arbitrary Metric Spaces," SIGMOD 2006, June 27-29, 2006, Chicago, Illinois, USA.
[9] Fazeli Soude, Hendrik Drachsler, Marlies Bitter-Rijpkema, Francis Brouns, Wim van der Vegt, and Peter B. Sloep, "User-centric Evaluation of Recommender Systems in Social Learning Platforms: Accuracy is Just the Tip of the Iceberg," IEEE Transactions on Learning Technologies, August 26, 2015.
[10] Georgoulas Konstantinos, Akrivi Vlachou, Christos Doulkeridis, and Yannis Kotidis, "User-Centric Similarity Search," IEEE Transactions on Knowledge and Data Engineering, Vol. 29, No. 1, January 2017.
[11] Hana Bydžovská, "Course Enrollment Recommender System," Proceedings of the 9th International Conference on Educational Data Mining, pp. 312-317.
[12] Hector Nunez, Miquel Sanchez-Marre, Ulises Cortes, Joaquim Comas, Montse Martinez, Ignasi Rodriguez-Roda, and Manel Poch, "A Comparative Study on the Use of Similarity Measures in Case-Based Reasoning to Improve the Classification of Environmental System Situations," Elsevier, Environmental Modelling and Software, 2003.
[13] Hristidis Vagelis, Nick Koudas, and Yannis Papakonstantinou, "PREFER: A System for the Efficient Execution of Multiparametric Ranked Queries," ACM SIGMOD 2001, Santa Barbara, California, USA.
[14] Jamil Itmazi and Miguel Megias, "Using Recommendation Systems in Course Management Systems to Recommend Learning Objects," 2008, pp. 234-240.
[15] J. Bobadilla et al., "Knowledge-Based Systems," Elsevier B.V., 2013.
[16] Lee Ken C. K., Baihua Zheng, and Wang-Chien Lee, "Ranked Reverse Nearest Neighbor Search," IEEE Transactions on Knowledge and Data Engineering, Vol. 20, No. 7, July 2008.
[17] Madhavi Alamuri, Bapi Raju Surampudi, and Atul Negi, "A Survey of Distance/Similarity Measures for Categorical Data," 2014 International Joint Conference on Neural Networks (IJCNN), July 6-11, 2014, Beijing, China.
[18] Queen Esther Booker, "A Student Program Recommendation System Prototype," Issues in Information Systems, 2009, pp. 544-551.
[19] Y. Subba Reddy and P. Govindarajulu, "A Survey on Data Mining and Machine Learning Techniques for Internet Voting and Product/Service Selection," IJCSNS International Journal of Computer Science and Network Security, Vol. 17, No. 9, September 2017.
[20] Usue Mori, Alexander Mendiburu, and Jose A. Lozano, "Similarity Measure Selection for Clustering Time Series Databases," IEEE Transactions on Knowledge and Data Engineering, Vol. 28, No. 1, January 2016.
[21] Vlachou Akrivi, Christos Doulkeridis, Yannis Kotidis, and Kjetil Norvag, "Reverse Top-k Queries," ICDE 2010.
[22] Yung-Shen Lin, Jung-Yi Jiang, and Shie-Jue Lee, "A Similarity Measure for Text Classification and Clustering," IEEE Transactions on Knowledge and Data Engineering, Vol. 26, No. 7, July 2014.
[23] Zhibo Wang, Jilong Liao, Qing Cao, Hairong Qi, and Zhi Wang, "Friendbook: A Semantic-based Friend Recommendation System for Social Networks," IEEE Transactions on Mobile Computing.

Y. Subba Reddy received the M.Sc. (Computer Science) degree from Bharathidasan University, Tiruchirapalli, TN, and the M.E. degree in Computer Science & Engineering from Sathyabama University, Chennai, TN. He is a research scholar in the Department of Computer Science, Sri Venkateswara University, Tirupati, AP, India. His research focus is on Data Mining in Clustering and Similarity Measures.