Data Mining:
Concepts and Techniques
— Chapter 6 —
Jiawei Han
Department of Computer Science
University of Illinois at Urbana-Champaign
/~hanj
©2006 Jiawei Han and Micheline Kamber, All rights reserved
2011-1-19 School of Management, HUST 1
Chapter 6. Classification and Prediction
What is classification? What is prediction?
Issues regarding classification and prediction
Classification by decision tree induction
Bayesian classification
Rule-based classification
Classification by back propagation
Support Vector Machines (SVM)
Associative classification
Lazy learners (or learning from your neighbors)
Other classification methods
Prediction
Accuracy and error measures
Ensemble methods
Model selection
Summary
Classification vs. Prediction
Classification
predicts categorical class labels (discrete or nominal)
constructs a model from the training set and the values
(class labels) of a classifying attribute, then uses that
model to classify new data
Prediction
models continuous-valued functions, i.e., predicts
unknown or missing values
Typical applications
Credit approval
Target marketing
Medical diagnosis
Fraud detection
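The distinction between classification and prediction can be illustrated with a small sketch. The data, the function names (classify_1nn, fit_line), and the choice of a nearest-neighbor rule and a least-squares line are assumptions made for illustration only; they are not taken from the slides. The point is just that one task returns a categorical label and the other returns a continuous value.

```python
def classify_1nn(train, x):
    """Classification: return the class label of the closest training point."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def fit_line(points):
    """Prediction: ordinary least-squares fit of y = a*x + b to (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Hypothetical training data: income (thousands) -> credit decision
labeled = [(20, "reject"), (35, "reject"), (60, "approve"), (80, "approve")]
label = classify_1nn(labeled, 55)   # a discrete class label

# Hypothetical training data: years of experience -> salary (thousands)
numeric = [(1, 30.0), (2, 35.0), (3, 40.0), (4, 45.0)]
a, b = fit_line(numeric)
salary = a * 5 + b                  # a continuous value

print(label, salary)
```

Here the classifier can only ever answer with one of the labels seen in training, while the fitted line can return any real number, including values never observed.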
Classification—A Two-Step Process
Model construction: describing a set of predetermined classes
Each tuple/sample is assumed to belong to a predefined class,
as determined by the class label attribute
The set of tuples used for model construction is the training set
The model is represented as classification rules, decision trees,
or mathematical formulae
Model usage: for classifying future or unknown objects
Estimate accuracy of the model
The known label of each test sample is compared with the
classified result from the model
Accuracy rate is the percentage of test set samples that are
correctly classified by the model
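The two steps can be sketched in a few lines. This is a minimal illustration under stated assumptions: the model is a single threshold rule of the form "IF value >= threshold THEN yes ELSE no", and the data sets and function names (build_model, classify, accuracy) are hypothetical, not from the slides.

```python
def build_model(training_set):
    """Step 1, model construction: pick the threshold that makes the
    fewest mistakes on the labeled training tuples."""
    candidates = sorted(v for v, _ in training_set)

    def errors(t):
        # Count training tuples the rule 'value >= t means yes' gets wrong.
        return sum((v >= t) != (label == "yes") for v, label in training_set)

    return min(candidates, key=errors)


def classify(threshold, value):
    """Step 2, model usage: apply the learned rule to one object."""
    return "yes" if value >= threshold else "no"


def accuracy(threshold, test_set):
    """Percentage of test samples whose known label matches the model output."""
    correct = sum(classify(threshold, v) == label for v, label in test_set)
    return 100.0 * correct / len(test_set)


# Hypothetical labeled data: (attribute value, class label)
training_set = [(1, "no"), (2, "no"), (3, "no"),
                (6, "yes"), (7, "yes"), (8, "yes")]
test_set = [(2, "no"), (4, "no"), (5, "yes"), (9, "yes")]

threshold = build_model(training_set)
acc = accuracy(threshold, test_set)
print(threshold, acc)
```

The test set is kept separate from the training set on purpose: measuring accuracy on the tuples used to build the model would tend to overstate how well it classifies unseen objects.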