Editorial
A framework for detecting deviations in complex event logs
Deviating behavior within an organization can lead to unexpected results. The effects of deviations are often negative, but sometimes also positive. Therefore, it is useful to detect deviations from event logs which record all the behavior of the ...
Multidimensional benchmarking in data warehouses
Benchmarking is among the most widely adopted practices in business today. However, to the best of our knowledge, conducting multidimensional benchmarking in data warehouses has not been explored from a technical efficiency perspective. In this ...
A novel data reduction method based on information theory and the Eclectic Genetic Algorithm
A common task in data analysis is to find the appropriate data sample whose properties allow us to infer the parameters and behavior of the data population. In data mining this task makes sense since the population is usually very large, ...
ClusterMPP: An unsupervised density-based clustering algorithm via Marked Point Process
Conventional clustering algorithms optimize a single criterion, which may not conform to diverse needs of multidimensional data science. This paper proposes a new clustering algorithm that solves multiple clustering issues, called clustering by ...
Unsupervised event exploration from social text streams
Social media provides unprecedented opportunities for people to disseminate information and share their opinions and views online. Extracting events from social media platforms such as Twitter could help in understanding what is being discussed. ...
Learning speed of supervised neural networks as similarity measurement in unsupervised cluster analysis
Cluster analysis, or clustering, is one of the most important and widely used techniques for data exploration and knowledge discovery. It is concerned with partitioning a set of objects in such a way that objects in the same groups, called clusters, ...
Nonparametric multi-assignment clustering
Multi-label learning has attracted significant attention from machine learning and data mining over the last decade. Although many multi-label classification algorithms have been devised, few research studies focus on multi-assignment clustering (...
Instance-based classification with Ant Colony Optimization
Instance-based learning (IBL) methods predict the class label of a new instance based directly on the distance between the new unlabeled instance and each labeled instance in the training set, without constructing a classification model in the ...
Assessing university enrollment and admission efforts via hierarchical classification and feature selection
Recruiting prospective students efficiently and effectively is a very important challenge for universities, mainly because of the increasing competition and the relevance of enrollment-generated revenues. This work provides an intelligent system ...
Dynamic sparsity control in Deep Belief Networks
A Deep Belief Network (DBN) is a generative probabilistic graphical model that contains many layers of hidden variables and has excelled among deep learning approaches. DBN can extract suitable features, but improving these networks for obtaining ...
Apriori and GUHA – Comparing two approaches to data mining with association rules
Two approaches to data mining with association rules are compared – the apriori algorithm and the ASSOC procedure. The first one was developed for market basket analysis at the beginning of the 1990s. An association rule is understood as an ...
A guidance of data stream characterization for meta-learning
André Luis Debiaso Rossi, Bruno Feres de Souza, Carlos Soares, André Carlos Ponce de Leon Ferreira de Carvalho
The problem of selecting learning algorithms has been studied by the meta-learning community for more than two decades. One of the most important tasks for the success of a meta-learning system is gathering data about the learning process. This ...