0% found this document useful (0 votes)

1K views

Introduction To Data Mining

This document provides an introduction to data mining and knowledge discovery in databases (KDD). It discusses how data mining can extract meaningful patterns and knowledge from large amounts of data. The key applications of data mining include credit ratings, targeted marketing, fraud detection, and customer relationship management. Data mining uses algorithms to classify data, predict future trends, cluster similar data, and discover relationships between data. Common algorithms include classification, prediction, clustering, and association rule mining. The goal of data mining is to extract valid, novel, useful, and understandable patterns from data.

Uploaded by

Avijit Karan

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

1K views

Introduction To Data Mining

Uploaded by

Avijit Karan

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

You are on page 1/ 69

An Introduction to Data Mining

1
Understanding Data Mining
 Data growing at a phenomenal rate – almost double every
year
 Data kept in files but mostly in Relational Data bases
 Large operational databases usually for OLTP applications e.g.
Banking
 Large pool of past historical data
 Can we do something with the large amount of data?
 Find meaningful and interesting information from the data
 Can standard SQL do it? No, we need different approach and
algorithms
 Data Mining is the answer

2
Why Now?

 Data is being produced

 Data is being warehoused (Data Warehouse)
 The computing power is available
 The computing power is affordable
 The competitive pressures are strong
 Commercial products are available

3
Data Mining
 Credit ratings/targeted marketing
 Given a database of 100,000 names, which persons are the least likely
to default on their credit cards?
 Identify likely responders to sales promotions
 Fraud detection
 Which types of transactions are likely to be fraudulent, given the
demographics and transactional history of a particular customer?
 Customer relationship management
 Which customers are likely to be the most loyal, and which are most
likely to leave for a competitor?
Data Mining helps extract such information

4
What is Data Mining?
 Extracting or Mining knowledge from large data set to find
patterns that are
 valid: hold on new data with some certainty
 novel: non-obvious to the system
 useful: should be possible to act on the item
 understandable: humans should be able to interpret the pattern
 Also known as Knowledge Discovery in Databases
 Example
 Which items are purchased together in a retail store?

 Fraudulent usage of credit cards – detect purchase of

extremely large amount compared to regular purchases

5
Applications
 Banking: loan/credit card approval
 predict good customers based on old customers
 Customer relationship management
 identify those who are likely to leave for a competitor.
 Targeted marketing
 identify likely responders to promotions
 Fraud detection - telecommunications, financial
transactions
 from an online stream of event identify fraudulent events
 Manufacturing and production
 automatically adjust knobs when process parameter changes

6
Applications (continued)

 Medicine: disease outcome, effectiveness of

treatments
 analyze patient disease history: find relationship between
diseases
 Molecular/Pharmaceutical: identify new drugs
 Scientific data analysis:
 identify new galaxies by searching for sub clusters
 Web site/store design and promotion:
 find affinity of visitor to pages and modify layout

7
Data Mining Versus KDD

 Knowledge Discovery in databases (KDD) is the

process of finding useful information and
patterns in data
 Data Mining is the use of algorithms to extract
the information and pattern derived by the KDD
process
 Often these two terms are used interchangeably
 KDD is a process that involves different steps

8
The KDD process
 Problem formulation
 Data collection
 Result evaluation and Visualization:
 Data cleaning : Remove noise and inconsistent data
 Data integration : Data from multiple sources combined
 Data selection: Select data relevant to the mining task
 Data transformation : Transform data (Summary, aggregation or
consolidate) appropriate for mining
 Data Mining : Find interesting pattern
 Pattern Evaluation : Identify truly interesting pattern
 Presentation : GUI

Knowledge discovery is an iterative process 9

Mining on what kind of data?

 Large Data Volume

 Relational Database
 Data Warehouse
 Flat files
 Web
 Transaction Database
 Object Oriented or Object Relational Database

10
Data Mining works with
Warehouse Data
 Data Warehousing provides the
Enterprise with a memory

Data Mining provides the

Enterprise with intelligence

11
Data Mining Algorithms

 Data mining involves different algorithms to

accomplish different tasks
 All these algorithms attempt to fit a model to the
data
 The algorithms examine the data and determine
a model that is a closest to the characteristics of
the data being examined

12
Some basic data mining tasks
 Predictive: Predict values of data using known
result found from different data (historical)
 Regression
 Classification
 Time series analysis
 Descriptive: Identifies patterns or relationship in
data
 Clustering / similarity matching
 Association rules and variants
 Summarization
 Sequence Discovery
13
Supervised Learning vs. Unsupervised
Learning
 Supervised learning (classification)
 Supervision: The training data (observations,
measurements, etc.) are accompanied by labels indicating
the class of the observations
 New data is classified based on the training set
 Unsupervised learning (clustering)
 The class labels of training data is unknown
 Given a set of measurements, observations, etc. with the
aim of establishing the existence of classes or clusters in
the data

14
Classification
 Maps data into predefined classes or groups
 Given old data about customers and payments,
predict new applicant’s loan eligibility.

Previous Classifier Decision rules

customers
Salary > 5 L
Age Good/
Salary
Profession
Prof. = Exec
bad
Location
Customer type
New applicant’s data

15
Classification

 Example I : general classification

 Credit Card companies must determine whether to
authorize credit card purchases. Each purchase is
placed in one of the four categories:
 Authorize
 Ask for further identification before authorization
 Do not authorize
 Do not authorize but contact police

16
Classification

 Example II : Pattern recognition

 An airport security screening station is used to
determine if passengers are potential criminals or
terrorists
 To do this face of each passenger is scanned and its
basic pattern (distance between eyes, size and shape
of mouth, shape of head etc.) is identified
 The pattern is compared to the entries in a database to
see if it matches any patterns of known offenders

17
Classification vs. Prediction
 Classification
 predicts categorical class labels (discrete or nominal)

 classifies data (constructs a model) based on the training set

and the values (class labels) in a classifying attribute and

uses it in classifying new data
 Prediction
 models continuous-valued functions, i.e., predicts unknown

or missing values
 Typical applications
 Credit approval

 Target marketing

 Medical diagnosis

 Fraud detection

18
Model Construction (Process I)
Classification
Algorithms
Training
Data

NAME RANK YEARS TENURED Classifier

Mike Assistant Prof 3 no (Model)
Mary Assistant Prof 7 yes
Bill Professor 2 yes
Jim Associate Prof 7 yes IF rank = ‘professor’
Dave Assistant Prof 6 no
OR years > 6
Anne Associate Prof 3 no
THEN tenured = ‘yes’

19
Classification
Use the Model in Prediction (Process II)

Classifier

Testing
Data Unseen Data

(Jeff, Professor, 4)
NAME RANK YEARS TENURED
Tom Assistant Prof 2 no Tenured?
Merlisa Associate Prof 7 no
George Professor 5 yes
Joseph Assistant Prof 7 yes

20
Another example

21
Association Rules and
Market Basket Analysis

22
What is Market Basket Analysis?
 Customer Analysis
 Market Basket Analysis uses the information about
what a customer purchases to give us insight into
who they are and why they make certain purchases.

 Product Analysis
 Market basket Analysis gives us insight into the
merchandise by telling us which products tend to
be purchased together and which are most
amenable to purchase.

23
Market Basket Example

? Where should detergents be placed in the

Store to maximize their sales?

? Are window cleaning products purchased

when detergents and orange juice are
bought together?

? Is soda typically purchased with bananas?

Does the brand of soda make a difference?

? How are the demographics of the

neighborhood affecting what customers
are buying?

24
Association Rules
 There has been a considerable amount of research in the area of Market Basket Analysis.
Its appeal comes from the clarity and utility of its results, which are expressed in the
form association rules.
 Given
 A database of transactions
 Each transaction contains a set of items

 Find all rules X->Y that correlate the presence of one set of items X with another set of
items Y
 Example: When a customer buys bread and butter, they buy milk 85% of the time

+
25
Results: Useful, Trivial, or Inexplicable?
 While association rules are easy to understand, they are not always useful.

Useful: On Fridays convenience store customers often

purchase diapers and beer together.
Trivial: Customers who purchase maintenance
agreements are very likely to purchase large
appliances.
Inexplicable: When a new Super Store opens, one of the
most commonly sold item is light bulbs.

26
How Does It Work?
Grocery Point-of-Sale Transactions

Customer Items
1 Orange Juice,
juice, Soda
2 Milk, Orange Juice, Window Cleaner
3 Orange Juice, Detergent
4 Orange Juice, Detergent, soda
juice, detergent, Soda
5 Window Cleaner, Soda
cleaner, soda

Co-Occurrence of Products
Window
OJ Cleaner Milk Soda Detergent
OJ 4 1 1 2 1
Window Cleaner 1 2 1 1 0
Milk 1 1 1 0 0
Soda 2 1 0 3 1
Detergent 1 0 0 1 2

27
How Does It Work?
 The co-occurrence table contains some simple patterns
 Orange juice and soda are more likely to be purchased together than any other two items
 Detergent is never purchased with window cleaner or milk
 Milk is never purchased with soda or detergent
 These simple observations are examples of Associations and may suggest a formal rule like:
 If a customer purchases soda, THEN the customer also purchases orange juice

Window
OJ Cleaner Milk Soda Detergent
OJ 1 4 1 2 1
Window Cleaner 2 1 1 1 0
Milk 1 1 1 0 0
Soda 1 2 0 3 1
Detergent 0 1 0 1 2

28
How Good Are the Rules?
 In the data, two of five transactions include both soda and orange juice, These
two transactions support the rule. The support for the rule is two out of five or
40%
 Since both transactions that contain soda also contain orange juice there is a
high degree of confidence in the rule. In fact every transaction that contains
soda contains orange juice. So the rule If soda, THEN orange juice has a
confidence of 100%.

29
Confidence and Support - How
Good Are the Rules
 A rule must have some minimum user-specified confidence
 1 & 2 -> 3 has a 90% confidence if when a customer
bought 1 and 2, in 90% of the cases, the customer also
bought 3.
 A rule must have some minimum user-specified support
 1 & 2 -> 3 should hold in some minimum percentage of
transactions to have value.

30
Confidence and Support
Transaction ID # Items
1 { 1, 2, 3 }
For minimum support = 50% = 2 transactions
2 { 1,3 }
and minimum confidence = 50%
3 { 1,4 }
4 { 2, 5, 6 }
Frequent One Item Set Support
{1} 75 %
{2} 50 %
For the rule 1=> 3:
{3} 50 % Support = Support({1,3}) = 50%
{4} 25 % Confidence (1->3) = Support ({1,3})/Support({1}) = 66%
Confidence (3->1)= Support ({1,3})/Support({3}) = 100%
Frequent Two Item Set Support
{ 1,2 } 25 %
{ 1,3 } 50 %
{ 1,4 } 25 %
{ 2,3 } 25 %

31
Association Examples
 Find all rules that have “Diet Coke” as a result. These rules may help
plan what the store should do to boost the sales of Diet Coke.
 Find all rules that have “Yogurt” in the condition. These rules may
help determine what products may be impacted if the store
discontinues selling “Yogurt”.
 Find all rules that have “Brats” in the condition and “mustard” in the
result. These rules may help in determining the additional items that
have to be sold together to make it highly likely that mustard will
also be sold.
 Find the best k rules that have “Yogurt” in the result.

32
Example - Minimum Support Pruning / Rule
Generation
Scan Database Find Pairings Find Level of Support

Transaction ID # Items Itemset Support Itemset Support

1 { 1, 3, 4 } {1} 2 {2} 3
2 { 2, 3, 5 } {2} 3 {3} 3
3 { 1, 2, 3, 5 } {3} 3 {5} 3
4 { 2, 5 } {4} 1
{5} 3

Scan Database Find Pairings Find Level of Support

Itemset Itemset Support Itemset Support

{2} { 2, 3 } 2 { 2, 5 } 3
{3} { 2, 5 } 3
Two rules with the highest support
{5} { 3, 5 } 2 for two item set: 2->5 and 5->2

33
Other Association Rule Applications
 Quantitative Association Rules
 Age[35..40] and Married[Yes] -> NumCars[2]
 Association Rules with Constraints
 Find all association rules where the prices of items are > 100 dollars
 Temporal Association Rules
 Diaper -> Beer (1% support, 80% confidence)
 Diaper -> Beer (20%support) 7:00-9:00 PM weekdays
 Optimized Association Rules
 Given a rule (l < A < u) and X -> Y, Find values for l and u such that support
greater than certain threshold and maximizes a support and confidence.
 Check Balance [$ 30,000 .. $50,000] -> Certificate of Deposit (CD)= Yes

+
34
Classification by Decision Tree
Learning
 A classic machine learning / data mining problem
 Develop rules for when a transaction belongs to a
class based on its attribute values
 Smaller decision trees are better
 ID3 is one particular algorithm

35
A Database… (Training Dataset)
Age Income Student Credit_Rating Buys_Computer
<=30 High No Fair No
<=30 High No Excellent No
31…40 High No Fair Yes
>40 Medium No Fair Yes
>40 Low Yes Fair Yes
>40 Low Yes Excellent No
31…40 Low Yes Excellent Yes
<=30 Medium No Fair No
<=30 Low Yes Fair Yes
>40 Medium Yes Fair Yes
<=30 Medium Yes Excellent Yes
31…40 Medium No Excellent Yes
31…40 High Yes Fair Yes
>40 Medium No Excellent No

36
Output: A Decision Tree
Age?

<=30 31..40 >40

Student? Yes Credit rating?

no yes excellent fair

No Yes Yes

37
Algorithm: Decision Tree Induction
 Basic algorithm (a greedy algorithm)
 Tree is constructed in a top-down recursive divide-and-conquer manner

 At start, all the training examples are at the root

 Attributes are categorical (if continuous-valued, they are discretized in

advance)
 Examples are partitioned recursively based on selected attributes

 Test attributes are selected on the basis of a heuristic or statistical

measure (e.g., information gain)

 Conditions for stopping partitioning
 All samples for a given node belong to the same class

 There are no remaining attributes for further partitioning – majority

voting is employed for classifying the leaf

 There are no samples left

38
Different Possibilities for Partitioning Tuples
Based on Splitting Criterion

39
Attribute Selection Measure: Information
Gain (ID3/C4.5)
 Select the attribute with the highest information gain
 Let pi be the probability that an arbitrary tuple in D belongs to
class Ci, estimated by |Ci, D|/|D|
 Expected information (entropy) needed to classify a tuple in D:
m
Info ( D )    pi log 2 ( pi )
 Information needed (after using A to split D i into
1 v partitions) to
classify D:
v | Dj |
InfoA (D)    Info(Dj )
 Information gained by branching on attributej 1A| D |

Gain(A) Info(D) InfoA(D)

40
Attribute Selection: Information Gain

g Class P: buys_computer = “Yes” 5 4

Info age ( D )  I ( 2 ,3 )  I ( 4 ,0 )
g Class N: buys_computer = “No” 14 14
9 9 5 5 5
Info ( D )  I (9,5)   log 2 ( )  log 2 ( )  0 .940  I (3, 2 )  0 .694
14 14 14 14 14
age pi ni I(pi, ni) 5
I ( 2,3)means “age <=30” has 5 out of
<=30 2 3 0.971 14
14 samples, with 2 yes’es and 3
31…40 4 0 0
no’s. Hence
>40 3 2 0.971
age income student credit_rating buys_computer Gain ( age )  Info ( D )  Info age ( D )  0.246
<=30 high no fair no
<=30 high no excellent no
31…40 high no fair yes Similarly,
>40 medium no fair yes
>40
>40
low
low
yes
yes
fair
excellent
yes
no
Gain(income)  0.029
31…40
<=30
low
medium
yes
no
excellent
fair
yes
no Gain( student )  0.151
<=30 low yes fair yes
>40 medium yes fair yes Gain(credit _ rating )  0.048
<=30 medium yes excellent yes
31…40 medium no excellent yes
31…40 high yes fair yes 41
>40 medium no excellent no
Attribute Age has the highest information gain

42
Computing Information-Gain for
Continuous-Value Attributes
 Let attribute A be a continuous-valued attribute
 Must determine the best split point for A
 Sort the value A in increasing order
 Typically, the midpoint between each pair of adjacent values is considered
as a possible split point
 (ai+ai+1)/2 is the midpoint between the values of ai and ai+1
 The point with the minimum expected information requirement for A is
selected as the split-point for A
 Split:
 D1 is the set of tuples in D satisfying A ≤ split-point, and D2 is the set of
tuples in D satisfying A > split-point

43
Gain Ratio for Attribute Selection (C4.5)
 Information gain measure is biased towards attributes with a
large number of values
 C4.5 (a successor of ID3) uses gain ratio to overcome the
problem (normalization to information gain)
v | Dj | | Dj |
SplitInfoA ( D)    log 2 ( )
j 1 |D| |D|
 GainRatio(A) = Gain(A)/SplitInfo(A)
 Ex. 4 4 6 6 4 4
SplitInfo A ( D )    log 2 ( )   log 2 ( )   log 2 ( )  0 .926
14 14 14 14 14 14
 gain_ratio(income) = 0.029/0.926 = 0.031
 The attribute with the maximum gain ratio is selected as the
splitting attribute

44
Gini index (CART, IBM IntelligentMiner)

 If a data set D contains examples from n classes, gini index, gini(D) is defined
as n
gini ( D )  1   p 2j
j 1
where pj is the relative frequency of class j in D
 If a data set D is split on A into two subsets D1 and D2, the gini index gini(D) is
defined as |D | |D |
gini A ( D )  1 gini ( D 1)  2 gini ( D 2 )
|D | |D |
 Reduction in Impurity:
gini( A)  gini(D)  giniA ( D)
 The attribute provides the smallest ginisplit(D) (or the largest reduction in
impurity) is chosen to split the node (need to enumerate all the possible
splitting points for each attribute) 45
Gini index (CART, IBM IntelligentMiner)

 Ex. D has 9 tuples in buys_computer = “yes” and 5 in “no”

2 2
   
9 5
gini( D)  1        0.459
 14   14 
 Suppose the attribute income partitions D into 10 in D1: {low, medium} and 4
 10  4
in D2 giniincome{low,medium} ( D)   Gini ( D1 )   Gini ( D1 )
 14   14 

but gini{medium,high} is 0.30 and thus the best since it is the lowest
 All attributes are assumed continuous-valued
 May need other tools, e.g., clustering, to get the possible split values
 Can be modified for categorical attributes
46
Comparing Attribute Selection Measures

 The three measures, in general, return good results but

 Information gain:
 biased towards multivalued attributes
 Gain ratio:
 tends to prefer unbalanced splits in which one partition is much
smaller than the others
 Gini index:
 biased to multivalued attributes
 has difficulty when # of classes is large
 tends to favor tests that result in equal-sized partitions and
purity in both partitions

47
Regression

 Used to map a data item to a real valued

prediction variable
 Regression involves learning of the function that
does this mapping
 Fits some known type of function – linear or any
other polynomial
 Goal: Predict class Ci = f(x1, x2, .. Xn)
 Regression: a*x1 + b*x2 + c = Ci.

48
Regression

 Example
 A person wishes to reach a certain level of savings
before retirement
 He wants to predict his savings value based on its
current value and several past values
 He uses a simple linear regression formula to predict
this value by fitting past behavior to a linear function
and then use this function to predict value at any point
of time

49
Time series analysis
 Value of an attribute is examined as it varies over time
 Evenly spaced time points – daily, weekly, horly etc.
 Three basic functions in time series analysis
 Distance measures are used to determine similarity between
time series
 Structure of the line is examined to determine its behavior
 Use historical time series plot to predict future values
 Application
 Stock market analysis – whether to buy a stock or not

50
Clustering : Unsupervised
Learning
 Similar to classification except groups are not
predefined rather defined by data alone
 Most similar data are grouped into same clusters
 Dissimilar data should be in different clusters

51
Clustering Examples

 Segment customer database based on similar

buying patterns.
 Group houses in a town into neighborhoods
based on similar features.
 Identify new plant species
 Identify similar Web usage patterns

52
Clustering
 Unsupervised learning when old data with class labels
not available e.g. when introducing a new product.
 Group/cluster existing customers based on time
series of payment history such that similar customers
in same cluster.
 Key requirement: Need a good measure of similarity
between instances.
 Identify micro-markets and develop policies for each

53
Clustering Example

54
Clustering Houses

Size Based
Geographic Distance Based
55
Clustering Issues

 Outlier handling
 Dynamic data
 Interpreting results
 Evaluating results
 Number of clusters
 Data to be used
 Scalability

56
Impact of Outliers on Clustering

57
Clustering Problem

 Given a database D={t1,t2,…,tn} of tuples and an

integer value k, the Clustering Problem is to
define a mapping f:Dg{1,..,k} where each ti is
assigned to one cluster Kj, 1<=j<=k.
 A Cluster, Kj, contains precisely those tuples
mapped to it.
 Unlike classification problem, clusters are not
known a priori.

58
Types of Clustering
 Hierarchical – Nested set of clusters created.
 Partitional – One set of clusters created.
 Incremental – Each element handled one at a
time.
 Simultaneous – All elements handled together.
 Overlapping/Non-overlapping

59
Agglomerative Example
A B C D E
A B
A 0 1 2 2 3
B 1 0 2 4 3
C 2 2 0 1 5 E C
D 2 4 1 0 3
E 3 3 5 3 0
D
Threshold of
1 2 34 5

A B C D E
61
Partitional methods: K-means
 Criteria: minimize sum of square of distance
 Between each point and centroid of the cluster.
 Between each pair of points in the cluster
 Algorithm:
 Select initial partition with K clusters: random, first K, K
separated points
 Repeat until stabilization:
 Assign each point to closest cluster center
 Generate new cluster centers
 Adjust clusters by merging/splitting

62
Collaborative Filtering

 Given database of user preferences, predict

preference of new user
 Example: predict what new movies you will like based
on
 your past preferences
 others with similar past preferences
 their preferences for the new movies
 Example: predict what books/CDs a person may want
to buy
 (and suggest it, or give discounts to tempt customer)

63
Mining market
 Around 20 to 30 mining tool vendors
 Major tool players:
 Clementine,
 IBM’s Intelligent Miner,
 SGI’s MineSet,
 SAS’s Enterprise Miner.
 All pretty much the same set of tools
 Many embedded products:
 fraud detection:
 electronic commerce applications,
 health care,
 customer relationship management: Epiphany

64
Large-scale Endeavors
Products
Clustering Classification Association Sequence Deviation
SAS Decision
Trees
SPSS  

Oracle  ANN
(Darwin)
IBM Time Decision   
Series Trees
DBMiner  
(Simon Fraser)

65
Vertical integration: Mining on the web

 Web log analysis for site design:

 what are popular pages,
 what links are hard to find.
 Electronic stores sales enhancements:
 recommendations, advertisement:
 Collaborative filtering: Net perception, Wisewire
 Inventory control: what was a shopper looking for and
could not find..

66
Some success stories
 Network intrusion detection using a combination of sequential
rule discovery and classification tree on 4 GB DARPA data
 Won over (manual) knowledge engineering approach
 http://www.cs.columbia.edu/~sal/JAM/PROJECT/ provides good detailed
description of the entire process
 Major US bank: customer attrition prediction
 First segment customers based on financial behavior: found 3 segments
 Build attrition models for each of the 3 segments
 40-50% of attritions were predicted == factor of 18 increase
 Targeted credit marketing: major US banks
 find customer segments based on 13 months credit balances
 build another response model based on surveys
 increased response 4 times -- 2%

67
Relationship with other fields

 Overlaps with machine learning, statistics, artificial

intelligence, databases, visualization but more stress
on
 scalability of number of features and instances
 stress on algorithms and architectures whereas
foundations of methods and formulations provided by
statistics and machine learning.
 automation for handling large, heterogeneous data

68
The Future
 Database – RDBMS & SQL are the two milestones in the evolution of
Database systems
 Currently, data mining is little more than a set of tools that can be used
to uncover previously hidden information in databases
 Many tools are available, but no all-encompassing model or approach
 Future is to create an all-encompassing tools, better integrated, less
human interaction and human interpretation
 A major development could be creation of sophisticated query
language that includes normal SQL and complicated OLAP functions
 DMQL is a step towards that

69
Thank You

Data Mining Notes For BCA 5th Sem 2019 PDF
No ratings yet
Data Mining Notes For BCA 5th Sem 2019 PDF
46 pages
1) Intro To Datamining
No ratings yet
1) Intro To Datamining
17 pages
Unit-4 Introduction To Data Mining
No ratings yet
Unit-4 Introduction To Data Mining
26 pages
Chapter 1 Data Mining Lecture Note
No ratings yet
Chapter 1 Data Mining Lecture Note
31 pages
V1-CH-6-Classification and Prediction
No ratings yet
V1-CH-6-Classification and Prediction
38 pages
IT326 - Ch1
100% (1)
IT326 - Ch1
17 pages
Lecture02 Frameworks Platforms-Part1
No ratings yet
Lecture02 Frameworks Platforms-Part1
40 pages
Data Mining: Business Intelligence
No ratings yet
Data Mining: Business Intelligence
68 pages
Data Mining
No ratings yet
Data Mining
31 pages
Datamining 1
No ratings yet
Datamining 1
30 pages
Introduction To Data Mining
No ratings yet
Introduction To Data Mining
44 pages
Supervised Learning Part1
No ratings yet
Supervised Learning Part1
42 pages
Data Mining and Predictive Modelling: (CSET228)
No ratings yet
Data Mining and Predictive Modelling: (CSET228)
14 pages
Data and Data Mining
No ratings yet
Data and Data Mining
9 pages
CH 1 Intro To Data Mining
No ratings yet
CH 1 Intro To Data Mining
17 pages
Data Mining: Steps and Functionalities
No ratings yet
Data Mining: Steps and Functionalities
17 pages
Data Mining Overview
No ratings yet
Data Mining Overview
14 pages
Ch2 DTasks
No ratings yet
Ch2 DTasks
44 pages
Introduction Am
No ratings yet
Introduction Am
74 pages
DWDWM Unit1
No ratings yet
DWDWM Unit1
17 pages
Introduction To Data Mining
No ratings yet
Introduction To Data Mining
10 pages
Analytics Prepbook Laterals 2019-2020
100% (1)
Analytics Prepbook Laterals 2019-2020
40 pages
Unit 1 Datamining
No ratings yet
Unit 1 Datamining
16 pages
Data Science Interview Questions Leaked
100% (3)
Data Science Interview Questions Leaked
12 pages
Chapter 6 Data Mining
No ratings yet
Chapter 6 Data Mining
39 pages
Session 2 - Foundations of Data and Information - 2024
No ratings yet
Session 2 - Foundations of Data and Information - 2024
33 pages
Practice
No ratings yet
Practice
5 pages
Unit-3
No ratings yet
Unit-3
53 pages
Chapter 1 Intro
No ratings yet
Chapter 1 Intro
23 pages
Understanding Data Mining
No ratings yet
Understanding Data Mining
21 pages
What Is Data Mining
No ratings yet
What Is Data Mining
8 pages
Unit 3.1
No ratings yet
Unit 3.1
23 pages
01 - Introduction To Datamining
No ratings yet
01 - Introduction To Datamining
19 pages
Prof. Chandan Singhavi
No ratings yet
Prof. Chandan Singhavi
86 pages
Data Mining and Data Profiling - Nargis Hamid Monami
No ratings yet
Data Mining and Data Profiling - Nargis Hamid Monami
7 pages
important questions unit-1
No ratings yet
important questions unit-1
20 pages
Data Visualization Using Business Intelligence (MDS204) : Arti Yadav Einfach Bussiness Analytics PVT LTD
No ratings yet
Data Visualization Using Business Intelligence (MDS204) : Arti Yadav Einfach Bussiness Analytics PVT LTD
60 pages
Data Mining
No ratings yet
Data Mining
8 pages
Assignment Solution
No ratings yet
Assignment Solution
27 pages
Topic10 - Data Mining
No ratings yet
Topic10 - Data Mining
29 pages
Unit 3 - Data Mining - WWW - Rgpvnotes.in PDF
No ratings yet
Unit 3 - Data Mining - WWW - Rgpvnotes.in PDF
10 pages
Data Mining Roles in Extracting The Knowledge: Full Length Research Paper
No ratings yet
Data Mining Roles in Extracting The Knowledge: Full Length Research Paper
6 pages
2 - Business Problems and Data Science Solutions
No ratings yet
2 - Business Problems and Data Science Solutions
26 pages
Data Mining Concept (MMU)
No ratings yet
Data Mining Concept (MMU)
38 pages
Introduction
No ratings yet
Introduction
27 pages
Lecture_01_11jan
No ratings yet
Lecture_01_11jan
29 pages
AD3491 - Unit 1 - Introduction to Data Science Important Questions 2 Marks With Answer --3-8
No ratings yet
AD3491 - Unit 1 - Introduction to Data Science Important Questions 2 Marks With Answer --3-8
6 pages
Data Mining Questions
100% (1)
Data Mining Questions
7 pages
Business Analytics
No ratings yet
Business Analytics
56 pages
Data Warehousing & Data Mining Unit-3 Notes
No ratings yet
Data Warehousing & Data Mining Unit-3 Notes
27 pages
DataWarehouseMining Complete Notes
No ratings yet
DataWarehouseMining Complete Notes
55 pages
Discussion Questions BA
No ratings yet
Discussion Questions BA
11 pages
Data Mining and Data Warehousing
No ratings yet
Data Mining and Data Warehousing
25 pages
Data Mining
No ratings yet
Data Mining
8 pages
Fundamentals of Data Mining
No ratings yet
Fundamentals of Data Mining
36 pages
Data Mining Concepts
100% (3)
Data Mining Concepts
122 pages
Data Mining: Concepts and Techniques
No ratings yet
Data Mining: Concepts and Techniques
15 pages
Introduction to Robotics
From Everand
Introduction to Robotics
Swarnalata Verma
No ratings yet
Data Mining: Fundamentals and Applications
From Everand
Data Mining: Fundamentals and Applications
Fouad Sabry
No ratings yet
The Secret Of Machine Learning
From Everand
The Secret Of Machine Learning
Mhd Arjunanta
No ratings yet
Explanation of Benefits
No ratings yet
Explanation of Benefits
5 pages
On Francis Fukuyama
100% (1)
On Francis Fukuyama
54 pages
Exercise Integral Calculus
No ratings yet
Exercise Integral Calculus
1 page
Kenneth Foner - Getting A Quick Fix On Comonads
No ratings yet
Kenneth Foner - Getting A Quick Fix On Comonads
12 pages
30-Longest Increasing Subsequence-22-03-2024
No ratings yet
30-Longest Increasing Subsequence-22-03-2024
23 pages
MoteView Users Manual 7430-0008-04 B
No ratings yet
MoteView Users Manual 7430-0008-04 B
64 pages
Anders 1958 The Obsolescence of Privacy PDF
No ratings yet
Anders 1958 The Obsolescence of Privacy PDF
27 pages
Lms Activity 3 Two-Storey Residential House Magaddon, Engel A. Ground Floor Plan
No ratings yet
Lms Activity 3 Two-Storey Residential House Magaddon, Engel A. Ground Floor Plan
1 page
Hacking The Matrix
0% (2)
Hacking The Matrix
6 pages
Techniques For The Seismic Rehabilitation of Existing Buildings
No ratings yet
Techniques For The Seismic Rehabilitation of Existing Buildings
167 pages
7 - Classification
No ratings yet
7 - Classification
71 pages
Tle Module 16
No ratings yet
Tle Module 16
14 pages
Underground Fire Protection Pipes & Equipments
No ratings yet
Underground Fire Protection Pipes & Equipments
6 pages
Download full Animal Skeletons and Anatomy An Image Archive for Artists and Designers Kale James ebook all chapters
100% (1)
Download full Animal Skeletons and Anatomy An Image Archive for Artists and Designers Kale James ebook all chapters
34 pages
Hiring Manager ESG and Sustainability Services 1723318287
No ratings yet
Hiring Manager ESG and Sustainability Services 1723318287
1 page
Cc4ii8 en VS
No ratings yet
Cc4ii8 en VS
1 page
Ladakh 6N-7D
No ratings yet
Ladakh 6N-7D
2 pages
Article THE CPA GOES CSI
No ratings yet
Article THE CPA GOES CSI
5 pages
Finals MS Powerpoint SGN
No ratings yet
Finals MS Powerpoint SGN
18 pages
Service Manual: Nsx-Ds8
No ratings yet
Service Manual: Nsx-Ds8
44 pages
2751369-206 - Rev B
No ratings yet
2751369-206 - Rev B
1 page
Swimming With Dolphins
No ratings yet
Swimming With Dolphins
4 pages
Image Cse No Model No Part No Description
No ratings yet
Image Cse No Model No Part No Description
4 pages
Notes On Business Studies (The Nature of Business)
No ratings yet
Notes On Business Studies (The Nature of Business)
13 pages
Bhimashankar Resume 2021
No ratings yet
Bhimashankar Resume 2021
3 pages
BMEP Plus v1 0
100% (1)
BMEP Plus v1 0
6 pages
English Sheet Test 3, 5 & 9 10.04.25 Home Work
No ratings yet
English Sheet Test 3, 5 & 9 10.04.25 Home Work
5 pages
Wat Remover Info
0% (1)
Wat Remover Info
4 pages
ELC Test 1
No ratings yet
ELC Test 1
26 pages
DME Question Bank
No ratings yet
DME Question Bank
18 pages