Unit I DM
UNIT - I
4. Data mining
Apply algorithms to the transformed data and extract patterns
5. Pattern interpretation/evaluation
Pattern evaluation- evaluate the interestingness of the resulting patterns, or
apply interestingness measures to filter out uninteresting patterns
Knowledge presentation- present the mined knowledge; visualization
techniques can be used
6. Data visualization
Graphical- bar charts, pie charts, histograms
Geometric- box plots, scatter plots
Data Transformation
Genetic Algorithm
The genetic algorithm is a method for solving both constrained and unconstrained
optimization problems, based on natural selection, the process that drives
biological evolution.
The genetic algorithm repeatedly modifies a population of candidate solutions.
At each step, it selects individuals from the current population to be parents,
usually by a randomized scheme that favors fitter individuals, and uses them to
produce the children for the next generation. Over successive generations the
population evolves toward better solutions. E.g.: learning robots
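The select-parents, produce-children loop described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the toy objective (count the 1-bits in a bit string), the tournament selection scheme, and all function names and parameter values are assumptions chosen for the example.

```python
import random

def fitness(bits):
    # Toy objective (assumed for illustration): number of 1-bits, higher is better.
    return sum(bits)

def tournament(population, rng, k=3):
    # Draw k individuals at random and return the fittest of them.
    return max(rng.sample(population, k), key=fitness)

def crossover(a, b, rng):
    # Single-point crossover: prefix from one parent, suffix from the other.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(bits, rng, rate=0.02):
    # Flip each bit independently with a small probability.
    return [1 - b if rng.random() < rate else b for b in bits]

def run_ga(n_bits=20, pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: the current best individual survives unchanged.
        children = [max(population, key=fitness)]
        while len(children) < pop_size:
            child = crossover(tournament(population, rng),
                              tournament(population, rng), rng)
            children.append(mutate(child, rng))
        population = children
    return max(population, key=fitness)

best = run_ga()
print(fitness(best), best)
```

Because the best individual is always copied into the next generation, the best fitness in the population can never decrease from one generation to the next.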
Support Vector Machine
Support Vector Machine (SVM) is a supervised machine learning algorithm
capable of performing classification, regression and even outlier detection.
A linear SVM classifier separates two classes with a straight line (a hyperplane
in higher dimensions), chosen to leave the widest possible margin between the
classes. All the data points that fall on one side of the line are labelled as one
class and all the points that fall on the other side are labelled as the second.
E.g.: text recognition
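To make the "line between two classes" concrete, here is a small sketch of a linear SVM in two dimensions. It assumes a simple training method (per-sample subgradient steps on the regularized hinge loss) rather than the solvers that production libraries use, and the data points, function names, and hyperparameters are all made up for illustration.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    # Minimise (lam / 2) * ||w||^2 + hinge loss with per-sample subgradient steps.
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            # The regularisation term shrinks w slightly on every step.
            w = [wi * (1 - lr * lam) for wi in w]
            if margin < 1:
                # Point is misclassified or inside the margin:
                # move the line away from it.
                w[0] += lr * y * x1
                w[1] += lr * y * x2
                b += lr * y
    return w, b

def predict(w, b, point):
    # The side of the line w . x + b = 0 decides the class.
    return 1 if w[0] * point[0] + w[1] * point[1] + b >= 0 else -1

# Hypothetical, linearly separable toy data centred on the origin.
points = [(-2.0, -1.0), (-1.0, -2.0), (-2.0, -2.0),
          (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
print("w =", w, "b =", b)
print([predict(w, b, p) for p in points])
```

The learned weights `w` and bias `b` define the separating line; a new point is classified by checking which side of that line it falls on.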
Rough set techniques
Rough set theory is a methodology for database mining and knowledge
discovery in relational databases.
It is a relatively recent area of uncertainty mathematics, closely related to
fuzzy set theory.
The rough set approach is used to discover structural relationships within
imprecise and noisy data.
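One way rough sets handle imprecise data is by bracketing a target set between a lower approximation (objects that certainly belong to it) and an upper approximation (objects that possibly belong to it). The sketch below illustrates that idea on a tiny, made-up patient table; the function name, attribute names, and data are assumptions for the example only.

```python
from collections import defaultdict

def approximations(objects, attributes, target):
    # Objects with identical values on the chosen attributes are
    # indiscernible and fall into the same equivalence class.
    classes = defaultdict(set)
    for name, values in objects.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(name)

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:
            lower |= members   # class lies entirely inside the target set
        if members & target:
            upper |= members   # class overlaps the target set
    return lower, upper

# Hypothetical data: p1 and p2 look identical but only p1 has flu.
patients = {
    "p1": {"headache": "yes", "temp": "high"},
    "p2": {"headache": "yes", "temp": "high"},
    "p3": {"headache": "no", "temp": "high"},
    "p4": {"headache": "no", "temp": "normal"},
}
flu = {"p1", "p3"}

lower, upper = approximations(patients, ["headache", "temp"], flu)
print("certainly flu:", sorted(lower))
print("possibly flu:", sorted(upper))
print("boundary:", sorted(upper - lower))
```

Here p1 and p2 cannot be told apart by the recorded attributes, so both land in the boundary region: the data is too coarse to decide about them.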
OTHER MINING PROBLEMS
Sequence mining- discovering interesting patterns in ordered (sequential) data
E.g.: DNA, protein, customer purchase history
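As a small illustration of sequence mining on purchase histories, the sketch below counts how many customers bought one item before another and keeps the ordered pairs that meet a minimum support. It is a deliberately simplified stand-in for full sequential-pattern algorithms; the function name, the sample histories, and the threshold are assumptions for the example.

```python
from itertools import combinations

def frequent_ordered_pairs(sequences, min_support):
    support = {}
    for seq in sequences:
        # combinations() preserves input order, so (a, b) means a came before b.
        # The set ensures each pair is counted at most once per sequence.
        for pair in set(combinations(seq, 2)):
            support[pair] = support.get(pair, 0) + 1
    return {pair: n for pair, n in support.items() if n >= min_support}

# Hypothetical purchase histories, one list per customer, in time order.
histories = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["case", "phone", "charger"],
    ["laptop", "mouse"],
]

result = frequent_ordered_pairs(histories, min_support=2)
for pair, count in sorted(result.items()):
    print(pair, count)
```

With these histories, "phone before charger" appears for three customers and "case before charger" for two, so those are the only pairs that reach the threshold.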