A Comparative Study Between Feature Selection Algorithms
1 Introduction
Usually, databases contain millions of tuples and thousands of attributes, presenting
dependencies among attributes [1]. The essential purpose of data preprocessing is to
manipulate and transform each dataset, making the information contained within it
more accessible and coherent [2, 3]. Data preprocessing involves choosing an outcome
measure to evaluate, identifying potential influencer variables, cleansing the data,
creating features, and generating data sets to feed into automated
analysis. Data preprocessing is an important step in the knowledge discovery process,
because quality decisions must be based on quality data. The main tasks of data
preprocessing are: data cleaning (including the handling of missing values and noisy
data), data integration, data transformation and feature selection [37].
A feature selection algorithm is a computational solution that is motivated by a
certain definition of relevance. However, the relevance of a feature as seen from the
inductive learning perspective may have several definitions depending on the objective
being pursued. An irrelevant feature is not useful for induction, but not all relevant
features are necessarily useful for induction [34]. Existing selection algorithms focus
mainly on finding relevant features [4]. Feature selection is thus a process in which the
most relevant characteristics are selected, improving knowledge discovery in databases.
This paper is structured as follows: problem, methodology used, feature selection
algorithms, results, and conclusions.
2 Problem
Feature selection, applied as a data preprocessing stage to data mining, proves to be
valuable in that it eliminates the irrelevant features that make algorithms ineffective.
Sometimes the percentage of instances correctly classified is higher if a previous feature
selection is applied, since the data to be mined will be noise-free [5]. This is usually
attributed to the “curse of dimensionality” or to the fact that irrelevant features decrease
the signal to noise ratio. In addition, many algorithms become computationally
intractable when the dimensionality is high [33].
The feature selection task is divided into four stages [6]: the first determines the
candidate set of attributes used to represent the problem; the subset of attributes
generated in the first stage is then evaluated; subsequently, it is examined whether the
selected subset satisfies the stopping criterion of the search; finally, the selected subset
is validated. Feature selection processes can be classified differently depending on the
stage on which we focus, yielding three categories [7]: filters, wrappers [8, 9] and
hybrids [10].
In filter methods, the selection procedure is performed independently of the
evaluation function of the learning algorithm. Four different evaluation measures can
be distinguished: distance, information, dependence and consistency; respective
examples of each of these measures can be found in [11 - 13]. Wrapper methods
combine the search in the attribute space with the machine learning algorithm, which
evaluates each set of attributes and chooses the most appropriate one. Hybrid models
combine the advantages that filter and wrapper models provide. Since feature selection
is applicable to dissimilar real situations, it is difficult to reach a consensus as to which
is the best possible choice; this is why multiple algorithms of this type exist [14].
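As an illustration of the filter/wrapper distinction (not taken from the paper, which relies on tools such as WEKA and Orange elsewhere), the following minimal Python sketch contrasts a filter and a wrapper selector using scikit-learn on synthetic data; the learner, the score function and all parameter values are arbitrary assumptions.

# Minimal sketch (illustrative only): a filter vs. a wrapper selector, assuming scikit-learn.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif, RFE
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=20,
                           n_informative=5, random_state=0)

# Filter: scores each feature independently of any learner (an information measure).
filter_sel = SelectKBest(score_func=mutual_info_classif, k=5).fit(X, y)
print("Filter keeps features:", sorted(filter_sel.get_support(indices=True)))

# Wrapper: searches feature subsets guided by a learning algorithm's performance.
wrapper_sel = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5).fit(X, y)
print("Wrapper keeps features:", sorted(wrapper_sel.get_support(indices=True)))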
3 Methodology
This section describes the type of scientific research used in the article along with the
research method and the development methodology. The type of scientific research
applied in this paper is descriptive-exploratory with an experimental approach.
According to the formal research process, a hypothetical-deductive method was used:
a hypothesis was formulated and then validated empirically through deductive
reasoning. Based on the experimentation, a mechanism for weighting the algorithm
evaluation indicators was established, in such a way that this mechanism could be
evaluated when changing the dataset.
The following tasks are defined to obtain the results, after applying the appropriate
algorithms in feature selection:
Collection, integration, and data preprocessing. This phase covers the collection and
integration of different datasets, data transformation where required, and cleaning in
order to eliminate noise.
Definition and application of tests of the algorithms used for feature selection.
Based on the tests performed with the synthetic data, the algorithms were applied and
their evaluation indicators were computed.
Review of test results. The data obtained by applying the algorithms were analyzed.
Likewise, the complexity of the algorithms was calculated in order to determine the
feasibility of their implementation.
The development of this paper is based on the analysis and choice of algorithms for
feature selection. In this stage, the case studies, the conclusions of this research and the
machine learning algorithms used in feature selection are of vital importance.
4 Feature selection algorithms
4.1 Decision trees
A decision tree is a representation in which each set of possible conclusions is
implicitly established by a list of samples of known class [15]. A decision tree has a
simple form that efficiently classifies new data [16, 17].
These trees are considered as an important tool for data mining; compared to other
algorithms, decision trees are faster and more accurate [18]. Learning with decision
trees is a method to approximate a discrete-valued objective function, in which the
learned function is represented by a tree. These learning methods are among the most
popular inductive inference algorithms and they have thriving applications in various
machine learning tasks [19 - 21]. Information theory provides a mathematical model
(equation 1) to measure the total disorder in a database.
$is_{Aver} = \sum_{b=1} \frac{n_b}{n_t} \cdot \sum_{c=1} \left( -\frac{n_{bc}}{n_b} \log\left(\frac{n_{bc}}{n_b}\right) \right)$   (1)
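As a minimal illustration of equation 1 as reconstructed above, the following Python sketch computes the average information of a split from per-branch class counts; the interpretation of n_t, n_b and n_bc as the total number of samples, the samples in branch b and the samples of class c in branch b is an assumption, as is the base-2 logarithm.

import math

def average_information(branch_class_counts):
    """Average disorder after a split (equation 1, as reconstructed):
    sum over branches b of (n_b / n_t) times the class entropy inside branch b."""
    n_t = sum(sum(counts) for counts in branch_class_counts)
    is_aver = 0.0
    for counts in branch_class_counts:        # one entry per branch b
        n_b = sum(counts)
        for n_bc in counts:                   # one entry per class c
            if n_bc:
                is_aver += (n_b / n_t) * (-(n_bc / n_b) * math.log2(n_bc / n_b))
    return is_aver

# Example: a binary split producing branches with class counts [8, 2] and [3, 7].
print(average_information([[8, 2], [3, 7]]))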
4.2 Entropy measure for ranking features
This algorithm ranks features by measuring how much the entropy of a dataset changes
when a feature is removed [33]. The similarity between two numeric instances Xi and
Xj is computed as
$S_{ij} = e^{-\alpha D_{ij}}$   (2)
where Dij is the distance between the instances Xi and Xj, and α is a parameter
expressed in mathematical terms (equation 3).
$\alpha = \frac{-\ln(0.5)}{D}$   (3)
D is the average distance between the samples in the dataset; in practice, α is close to
0.5. The Euclidean distance is calculated as follows (equation 4).
$D_{ij} = \sqrt{\sum_{k=1}^{n} \left( \frac{X_{ik} - X_{jk}}{\max_k - \min_k} \right)^2}$   (4)
Where n is the number of attributes, and maxk and mink are the maximum and
minimum values used to normalize the k-th attribute. When the attributes are
categorical, the Hamming distance is used to compute the similarity (equation 5).
$S_{ij} = \frac{\sum_{k=1}^{n} |X_{ik} = X_{jk}|}{n}$   (5)
Where |Xik = Xjk| is 1 if Xik = Xjk and 0 otherwise. The distribution of all similarities
for a given data set is a characteristic of the organization and order of the data in an
n-dimensional space. This organization may be more or less ordered. Changes in the
level of order in a data set are the main criterion for inclusion or exclusion of a feature
from the feature set; these changes may be measured by entropy [33]. The algorithm in
question compares the entropy of a data set before and after deleting attributes. For a
data set of N instances, the entropy measure is (equation 6).
$E = -\sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \left( S_{ij} \log S_{ij} + (1 - S_{ij}) \log(1 - S_{ij}) \right)$   (6)
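A minimal Python sketch of equations 4 - 6 follows; the use of the natural logarithm and the convention that pairs with similarity exactly 0 or 1 contribute zero to the entropy are assumptions.

import numpy as np

def euclidean_distance(Xi, Xj, mins, maxs):
    """Normalized Euclidean distance between two numeric instances (equation 4)."""
    return np.sqrt(np.sum(((Xi - Xj) / (maxs - mins)) ** 2))

def hamming_similarity(Xi, Xj):
    """Fraction of matching categorical attributes (equation 5)."""
    return np.mean(Xi == Xj)

def entropy_measure(S):
    """Entropy of a pairwise similarity matrix S (equation 6), summed over i < j."""
    E = 0.0
    n = len(S)
    for i in range(n - 1):
        for j in range(i + 1, n):
            s = S[i, j]
            if 0.0 < s < 1.0:                 # terms with s in {0, 1} contribute 0
                E -= s * np.log(s) + (1.0 - s) * np.log(1.0 - s)
    return E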
The steps of the algorithm (Table 1) are based on sequential backward ranking, and
they have been successfully tested on several real-world applications.
Table 1. Algorithm steps [33]
1: Start with the initial full set of features F.
2: For each feature f ∈ F, remove the feature f from F and obtain a subset F_f. Find the
difference between the entropy for F and the entropy for each F_f.
3: Let f_k be the feature such that the difference between the entropy for F and the entropy
for F_fk is minimum.
4: Update the set of features: F = F − {f_k}, where "−" is the set-difference operation.
5: Repeat steps 2 - 4 until there is only one feature in F.
The ranking process may be stopped at any iteration and transformed into a feature
selection process by using an additional criterion in step 4: the difference between the
entropy for F and the entropy for F_f should be less than an approved threshold value
in order to remove feature f_k from the set F. High computational complexity is the
basic disadvantage of this algorithm; a parallel implementation could overcome the
problems of working sequentially with large data sets and large numbers of features.
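The following self-contained Python sketch implements the sequential backward ranking of Table 1 for categorical data, using the Hamming similarity of equation 5; taking the absolute value of the entropy difference is an assumption.

import numpy as np

def dataset_entropy(X):
    """Entropy of a categorical dataset X (rows = instances), per equations 5 and 6."""
    n = len(X)
    E = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s = np.mean(X[i] == X[j])         # Hamming similarity (equation 5)
            if 0.0 < s < 1.0:
                E -= s * np.log(s) + (1.0 - s) * np.log(1.0 - s)
    return E

def entropy_ranking(X):
    """Sequential backward ranking (Table 1): repeatedly drop the feature whose
    removal changes the entropy the least; return features in removal order."""
    remaining = list(range(X.shape[1]))
    ranking = []
    while len(remaining) > 1:
        E_full = dataset_entropy(X[:, remaining])
        diffs = []
        for f in remaining:
            subset = [g for g in remaining if g != f]
            diffs.append(abs(E_full - dataset_entropy(X[:, subset])))
        k = remaining[int(np.argmin(diffs))]  # least relevant remaining feature
        ranking.append(k)
        remaining.remove(k)
    return ranking + remaining                # last element is the most relevant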
4.3 Estimation of distribution algorithms (EDAs)
An EDA is a population-based stochastic search technique that uses a probability
distribution model to explore candidate solutions (instances) in a search space [25, 26].
EDAs have been recognized as strong optimization algorithms. They have shown better
performance than evolutionary algorithms on problems where the latter have not
produced satisfactory results. This is mainly because the relations or dependencies
between the most important variables of a particular problem are modeled explicitly,
estimated through probability distributions [27, 28]. Table 2 shows the EDA algorithm.
First, an initial population of individuals is generated. These individuals are evaluated
according to an objective or fitness function, which measures how appropriate each
individual is as a solution to the problem. Based on this evaluation, a subset of the best
individuals is selected, and from this subset a probability distribution is learned and
used to sample another population [27].
Table 2. EDA Algorithm pseudo-code [29]
Requires: candidate pool size n, the number of variables l and the cost function f(·).
1: θ ← initialize (l)
2: repeat
3: D ← sample (P(X; θ), n)
4: C ← select (D, f(·))
5: θ ← estimate (θ, C)
6: until P(X; θ) has converged
Outputs: Probability distribution P(X; θ)
This kind of algorithm is a stochastic search technique that evolves a probability
distribution model from a pool of solution candidates, rather than evolving the pool
itself. The distribution is adjusted iteratively with the most promising (sub-optimal)
solutions until convergence; hence, these methods are known as Estimation of
Distribution Algorithms. The generic estimation procedure is shown in Table 2.
Step (1) initializes the model parameters θ. Step (2) is the loop that updates the
parameters θ until convergence. Step (3) samples a pool D of n candidates from the
model. Step (4) ranks the pool according to the cost function f(·) and chooses the top-
ranked candidates into C. Step (5) re-estimates the parameters θ from this subset of
promising solutions [32]. An EDA [30, 31] approximately optimizes a cost function by
building a probabilistic model of a pool of promising sub-optimal solutions over a given
search space. For very high-dimensional search spaces, storing and updating a large
population of candidates may imply a computational burden in both time and memory.
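As an illustration of the loop in Table 2, the following minimal univariate EDA sketch in Python solves OneMax; it is not the Goldenberry implementation used later in Section 5.3, and the smoothing factor and parameter values are assumptions.

import numpy as np

def onemax(bits):
    return bits.sum(axis=1)                        # cost to maximize: number of ones

def univariate_eda(l=100, n=30, selected=10, iters=200, seed=0):
    """Univariate EDA following the loop in Table 2: sample from P(X; theta),
    select the best candidates, and re-estimate theta from their bitwise frequencies."""
    rng = np.random.default_rng(seed)
    theta = np.full(l, 0.5)                        # step 1: initialize model parameters
    best, best_cost = None, -1
    for _ in range(iters):                         # step 2: repeat until the budget is spent
        D = (rng.random((n, l)) < theta).astype(int)    # step 3: sample n candidates
        costs = onemax(D)
        top = D[np.argsort(costs)[::-1][:selected]]     # step 4: select the best candidates
        theta = 0.9 * theta + 0.1 * top.mean(axis=0)    # step 5: re-estimate (smoothed)
        if costs.max() > best_cost:
            best_cost, best = costs.max(), D[np.argmax(costs)]
    return best, best_cost

solution, cost = univariate_eda()
print("Best OneMax cost found:", cost)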
4.4 Bootstrapping algorithm
One option for suitable feature selection is to perform an exhaustive search, but this
entails a high complexity that makes it impractical. Feature selection methods must
search among the candidate subsets of attributes; this search may be complete,
sequential or random. The complete search, also known as exhaustive search, finds the
optimal result according to the evaluation criteria used; the problem is its high
complexity (i.e. O(2^n)), which makes it inappropriate when the number of attributes (n)
is high. The random search does not ensure an optimal result; it starts with a subset of
randomly selected attributes and proceeds from it, either with a sequential search or by
randomly generating the rest of the candidate subsets.
The bootstrapping algorithm replicates the classification experiments a large number
of times and estimates the solution from the set of these experiments. Because this
process is based on random selection, some data from the original set may not be used
while other data may be involved in more than one subset. The bootstrapping algorithm
is divided into four stages: generate subsets of attributes, evaluate each subset, update
the weights of each attribute, and order the attributes by their weight [35].
In the first stage, the subsets of attributes are generated; each subset contains up to a
maximum number of attributes selected randomly from the original dataset (an attribute
cannot appear twice in the same subset). Each subset is generated independently of the
previous ones (so similar subsets may occur). Both the number of generated subsets and
the maximum number of attributes they contain are established by the user.
The evaluation phase consists of classifying the original dataset with each of the
subsets of attributes generated in the previous stage. As a result, each subset is assigned
a goodness value, which is the success percentage of the corresponding classification.
In the update stage, the weights of the attributes of the original data set are assigned.
The weight of an attribute is the average goodness of the subsets that contain that
attribute.
The last step is to sort the attributes in descending order according to their weight.
This phase generates a list, which is the ranking returned by the process.
The difference between the bootstrapping algorithm and the exhaustive method lies
in the first stage. For example, the exhaustive method with K=1 generates a subset for
each feature of the original set; with K=2, a subset is generated for each possible pair
of features. Therefore, this first step of the exhaustive method depends on the value of
K, whereas in the random method it depends on the number of experiments and the
maximum number of features. Table 3 shows the bootstrapping algorithm, which is
divided into two functions. The first, RankingGenerate, is the main function and has as
input parameters: the classification method U used to evaluate each chosen subset of
features; the set of features to be treated X; the number of experiments to be performed
Ne; and the maximum number of features Na that may intervene in each experiment.
This function produces a single output parameter L, which corresponds to the feature
ranking obtained by applying the feature selection algorithm. For this purpose, the
RankingGenerate function sorts the attributes according to their weight. To calculate
this value, for each feature ai, the average of the success percentages obtained by
applying the classification method U on those subsets that contain the feature ai is
computed. The second function, SubsetGenerate, is used to obtain these subsets of
attributes; each subset is chosen randomly with an equally random size (greater than 1
and less than or equal to Na), so that there is no duplicated attribute in the same subset.
Table 3. Bootstrapping algorithm pseudo-code
Requires: Evaluation criterion U, features X, number of experiments Ne, maximum number of
features per subset Na, feature list L.
Function RankingGenerate
1: S ← SubsetGenerate (X, Ne, Na)
2: for each feature subset Si ∈ S
3: Ci ← Evaluate (Si, U)
4: Update the weight of each feature in Si
End for
5: L ← Sort (X)
End RankingGenerate Function
Function SubsetGenerate
1: for i ← 1 until Ne
2: n ← GenerateRandomNumber (1, Na)
3: Si ← ChooseFeatures (X, n)
4: S ← S + Si
End for
End SubsetGenerate Function
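A minimal Python sketch of the ranking procedure in Table 3 follows, assuming scikit-learn for the evaluation step; the choice of a decision tree classifier with 3-fold cross-validation as the evaluation criterion U, and the default values of Ne and Na, are assumptions rather than the paper's setting.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def subset_generate(n_features, Ne, Na, rng):
    """SubsetGenerate: Ne random subsets, each of a random size in [1, Na],
    with no duplicated feature inside a subset."""
    subsets = []
    for _ in range(Ne):
        size = rng.integers(1, Na + 1)
        subsets.append(rng.choice(n_features, size=size, replace=False))
    return subsets

def ranking_generate(X, y, Ne=200, Na=5, seed=0):
    """RankingGenerate: evaluate each subset with a classifier U and weight every
    feature by the average accuracy of the subsets that contain it."""
    rng = np.random.default_rng(seed)
    subsets = subset_generate(X.shape[1], Ne, Na, rng)
    scores = {f: [] for f in range(X.shape[1])}
    for S in subsets:
        acc = cross_val_score(DecisionTreeClassifier(random_state=0),
                              X[:, S], y, cv=3).mean()      # goodness of the subset
        for f in S:
            scores[f].append(acc)
    weights = {f: np.mean(v) if v else 0.0 for f, v in scores.items()}
    return sorted(weights, key=weights.get, reverse=True)   # descending feature ranking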
In [36], motivated by the fact that the variability introduced by randomization can
lead to inaccurate outputs, the authors propose a deterministic approach. First, they
establish several computational complexity results for the exact bootstrap method in
the case of the sample mean. Second, they present the first efficient deterministic
approximation algorithm (an FPTAS) for producing exact bootstrap confidence
intervals which, unlike traditional methods, has guaranteed bounds on the
approximation error. Third, they develop a simple exact algorithm for exact bootstrap
confidence intervals based on polynomial multiplication. Last, they provide empirical
evidence, involving several hundred (and in some cases over one thousand) data points,
that the proposed deterministic algorithms can quickly produce confidence intervals
that are substantially more accurate than those from randomized methods, and are thus
practical alternatives in applications such as clinical trials.
5 Results
5.1 Decision trees
The two datasets evaluated, Soybean and Chess, show totally different behaviors:
Soybean dataset: it shows a behavior in which only a low percentage of features is
selected; even at the highest selection percentage, the number of selected attributes does
not exceed 70%. This problem has been called the curse of dimensionality.
Chess dataset: the behavior of these data is the opposite of the Soybean dataset, since
the selection covers the largest part of the attribute space. This means that most
attributes are relevant and only begin to be removed at a high feature selection
percentage.
5.2 Entropy measure for ranking features
To test this algorithm, we used a dataset of four features (X1, X2, X3 and X4) with
one thousand instances. The features contain categorical data; therefore, we used the
Hamming distance to compute the similarity between the instances (equation 5) and
then computed the entropy (equation 6). The result shows that feature X3 is the least
relevant, since the difference between the total entropy and the entropy without the
third feature is the one closest to 0. Therefore, X3 should be removed from the dataset.
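The dataset used in this experiment is not included in the paper, so the following sketch builds a hypothetical categorical dataset of four features and one thousand instances in which X3 is pure noise, and applies equations 5 and 6 to check which feature's removal changes the entropy the least.

import numpy as np

def dataset_entropy(X):
    """Equation 6 over the Hamming similarities (equation 5) of all instance pairs."""
    S = (X[:, None, :] == X[None, :, :]).mean(axis=2)      # pairwise similarity matrix
    s = S[np.triu_indices(len(X), k=1)]
    s = s[(s > 0.0) & (s < 1.0)]                           # s in {0, 1} contributes 0
    return -np.sum(s * np.log(s) + (1 - s) * np.log(1 - s))

# Hypothetical stand-in for the paper's dataset: X1, X2 and X4 share structure through
# a hidden class, while X3 is pure noise and is expected to rank as least relevant.
rng = np.random.default_rng(0)
cls = rng.integers(0, 3, size=1000)
X = np.column_stack([cls, (cls + rng.integers(0, 2, 1000)) % 3,
                     rng.integers(0, 3, 1000), cls % 2])

E_full = dataset_entropy(X)
diffs = {f"X{f + 1}": abs(E_full - dataset_entropy(np.delete(X, f, axis=1)))
         for f in range(X.shape[1])}
print("Entropy differences:", diffs)
print("Least relevant feature:", min(diffs, key=diffs.get))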
5.3 Estimation of Distribution Algorithms (EDAs)
We solved the classical OneMax problem with 100 variables, using a population size
of 30, a maximum of 1000 evaluations, and 30 candidates per iteration. For the
execution of the EDA algorithm we used the Orange suite with the Goldenberry widget
[38]. The results are shown in Table 4. OneMax is a traditional test problem for
evolutionary computation: it involves binary bitstrings of fixed length, an initial
population is given, and the objective is to evolve the bitstrings to match a prespecified
bitstring [39].
Table 4. EDA applied result
Best:{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,
1,0,0,0,0,1,1,0,1,1,1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,
1,1,1,1,1}.
Cost: 89, Evals: 464, Argmin: 22, Argmax: 459, Min val: 44, Max val: 89, Mean:
68.4590517241, Stdev: 10.5445710397
6 Conclusions
Decision trees used as a feature selection algorithm are a good option. However, it
must be taken into account that the data set must be categorical, that the method only
applies to predictive problems, which limits the field of application, and that if the
dataset is incomplete the selection is not considered good.
Feature selection based on the entropy measure for ranking features is applicable only
to descriptive tasks, limiting the field of application just as decision trees do. Its main
weakness lies in its high complexity, since it compares every pair of instances. As for
EDAs, they enjoy a good reputation among feature selection algorithms; however, they
have some weaknesses, such as redundancy in generating the dependency trees, the fact
that open-source implementations have only been produced in interpreted languages,
and the limited evidence of their use with multivariate data.
The randomness used by the bootstrapping algorithm for feature selection implies
that the algorithm may present unexpected results at some point, in contrast to its speed
of execution. Ideally, algorithms that indicate whether the generated result is acceptable
should be used. It is emphasized that the bootstrapping algorithm behaves well with
large volumes of data.
References
1. D. Larose, Data Mining: Methods and Models. (USA), Wiley-Interscience (2006) 1- 3.
2. D. Pyle, Data Preparation for Data Mining. (USA), Morgan Kaufmann Publisher (1999) 15
- 19.
3. P. Bradley, O. Mangasarian, Feature selection via concave minimization and support vector
machine. (USA), Journal Machine learning ICML (1998) 82 - 90.
4. Y. Lei, L. Huan, Efficient Feature Selection via Analysis of Relevance and Redundancy.
(USA), Journal Machine Learning Research 5 (2004) 1205 - 1224.
5. I. Guyon, A. Elissee, An introduction to variable and feature selection. (USA), Journal
Machine learning research 3 (2003) 1157 - 1182.
6. M. Dash, H. Liu, Feature Selection for Classification. (USA), Journal Intelligent Data
Analysis 1 (3) (1996) 131 - 156.
7. H. Liu, Y. Lei, Toward Integrating Feature Selection Algorithms for Classification and
Clustering. (USA), IEEE Trans. on Knowledge and Data Engineering 17 (4) (2005) 491 -
502.
8. R. Kohavi, G. John, Wrappers for Feature Subset Selection. (USA), Artificial Intelligence 97
(12) (1997) 273 - 324.
9. G. Jennifer, Feature Selection for Unsupervised Learning. (USA), J. Mach. Learn. Res. 5
(2004) 845 - 889.
10. S. Das, Filters, Wrappers and a Boosting-Based Hybrid for Feature Selection. (USA), Proc.
18th Intl Conf. Machine Learning (2001) 74 - 81.
11. C. Cardie, Using Decision Trees to Improve Case-Based Learning. (USA), Proc. 10th Intll
Conf. Machine Learning, P. Utgo, ed. (1993) 25 - 32.
12. A. Mucciardi, E. Gose, A comparison of Seven Techniques for Choosing Subsets of Pattern
Recognition. (USA), IEEE Trans. Computer 20 (1971) 1023 - 1031.
13. R. Ruiz, J. Riquelme, J. Aguilar-Ruiz, Projection-based measure for efficient feature
selection. (USA), Journal of Intelligent and Fuzzy System 12 (2003) 175 - 183.
14. I. Pérez, R. Sánchez, Adaptación del método de reducción no lineal LLE para la selección
de atributos en WEKA. (Cuba), III Conferencia Internacional en Ciencias Computacionales
e Informáticas (2016) 1 - 7.
15. P. Winston, Inteligencia artificial. (USA), Addison Wesley (1994) 455 - 460.
16. S. Chourasia, Survey paper on improved methods of ID3 decision tree classification.
(USA), International Journal of Scientific and Research Publications (2013) 1 - 4.
17. J. Rodríguez, Fundamentos de minería de datos. (Colombia), Fondo de publicaciones de la
Universidad Distrital Francisco José de Caldas (2010) 63 - 64.
18. R. Changala, A. Gummadi, G. Yedukondalu, U. N. P. G. Raju, Classification by decision
tree induction algorithm to learn decision trees from the class-labeled training tuples.
(USA), International Journal of Advanced Research in Computer Science and Software
Engineering 2 (4) (2012) 427 - 434.
19. T. Michell, Machine learning. (USA), McGraw Hill (1997) 50 - 56.
20. J. Han, M. Kamber. Data mining: concepts and techniques. (USA), McGraw Hill 3 (2012)
331 - 336.
21. S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach. (USA), Prentice Hall
(2012) 531 - 540.
22. M. Kantardzic. Data mining: concepts, models, methods and algorithms. (USA), IEEE
Press Wiley-Interscience (2003) 46 - 48.
23. H. Liu, H. Motoda, Feature selection for knowledge discovery and data mining. (USA),
Kluwer Academic Publisher 2 (2000) 30 - 35.
24. H. Liu, H. Motoda, Feature extraction, construction and selection. A data mining
perspective. (USA), Kluwer Academic Publisher (2000) 20 - 28.
25. P. Larrañaga, J. Lozano, Estimation of Distribution Algorithms. A New Tool for
Evolutionary Computation. (USA), Kluwer Academic Publishers (2002) 1 - 2.
26. M. Pelikan, K. Sastry, Initial-population bias in the univariate estimation of distribution
algorithm. (USA), Proceedings of the 11th Annual Conference on Genetic and
Evolutionary Computation, GECCO '09 11 (2002) 429 - 436.
27. R. Pérez, A. Hernández, Un algoritmo de estimación de distribuciones para el problema de
secuencia-miento en configuración jobshop. (Mexico), Communication Del CIMAT 1
(2015) 1 - 4.
28. H. Mühlenbein, G. Paaß, From recombination of genes to the estimation of distributions I.
Binary parameters. (USA), Parallel Problem Solving from Nature - PPSN IV (1996) 178 - 187.
29. N. Rodríguez, Feature Relevance Estimation by Evolving Probabilistic Dependency
Networks and Weighted Kernel Machine. (Colombia), A thesis submitted to the District
University Francisco José de Caldas in fulfillment of the requirements for the degree of
Master of Science in Information and Communications (2013) 3 – 4.
30. E. Bengoetxea, P. Larrañaga, I. Bloch, and A. Perchant, Estimation of distribution
algorithms: A new evolutionary computation approach for graph matching problems. In
Energy Minimization Methods in Computer Vision and Pattern Recognition, (Germany),
volume 2134 of Lecture Notes in Computer Science (2001) 454 - 469.
31. M. Pelikan, K. Sastry, and E. Cantú-Paz, Scalable Optimization via Probabilistic Modeling:
From Algorithms to Applications. (USA), Springer-Verlag (2006).
32. N. Rodríguez and S. Rojas–Galeano. Discovering feature relevancy and dependency by
kernel-guided probabilistic model-building evolution. (USA). BioData Mining (2017)
10:12 DOI 10.1186/s13040-017-0131-y
33. M. Kantardzic, Data mining: concepts, models, methods and algorithms. Second edition.
(USA), IEEE Press Wiley-Interscience (2003) 68 - 70.
34. R. A. Caruana and D. Freitag. How useful is Relevance? Technical report, fall’94 AAAI
Symposium on Relevance, New Orleans, (1994).
35. J. Aguilar, and N. Díaz, Selección de atributos relevantes basada en bootstrapping.
(España), Actas del III Taller de Minería de Datos y Aprendizaje (TAMIDA’2005). (2005),
21 – 30.
36. D. Bertsimas and B. Sturt, Computation of exact bootstrap confidence intervals:
complexity and deterministic algorithms. Optimization Online an e-print site for the
optimization community. (2017).
37. H. Barrera, J. Correa and J. Rodríguez, Prototipo de software para el preprocesamiento de
datos - UDClear. IV Simposio Internacional de Sistemas de Información e Ingeniería de
Software en la Sociedad del Conocimiento, libro de actas volumen 1, ISBN 84-690-0258-
9. (2006).
38. N. Rodríguez and S. Rojas, Goldenberry: EDA Visual Programming in Orange. Proceedings
of the fifteenth annual conference companion on Genetic and evolutionary computation
conference companion (GECCO). (2013) pp. 1325 - 1332.
39. D. H. Wood, J. Chen, E. Antipov, B. Lemieux and W. Cedeño, A design for DNA
computation of the OneMax problem. Springer-Verlag, Soft Computing, (2001) Volume
5, Issue 1, 19–24.