Bioinformatics_lectures / lecture8.pptx

Knowledge Discovery in Databases

[Diagram: Data → Cleaning, Integration → Data Warehouse → Selection, Transformation → Prepared data → Data Mining → Patterns → Evaluation, Visualization → Knowledge → Knowledge Base]

Typical Tasks in Data Mining

Classification

Prediction

Clustering

Association Analysis

Summarization

Typical Tasks in Data Mining

Classification

From data with known labels, create a classifier that determines which label to apply to a new observation

E.g. Label loan applications as low, medium, or high risk
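To make the idea concrete, here is a minimal sketch of one possible classifier, a nearest-centroid rule. The slide does not name a method, so the technique, the two features (debt-to-income ratio and number of missed payments), and all the numbers are assumptions chosen for illustration only.

```python
from collections import defaultdict
from math import dist

def fit_centroids(samples, labels):
    """Average the feature vectors of each label to get one centroid per class."""
    groups = defaultdict(list)
    for features, label in zip(samples, labels):
        groups[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in groups.items()
    }

def classify(centroids, features):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

# Hypothetical labeled data: (debt-to-income ratio, missed payments)
train_x = [(0.10, 0), (0.15, 1), (0.35, 2), (0.40, 3), (0.70, 6), (0.80, 7)]
train_y = ["low", "low", "medium", "medium", "high", "high"]

centroids = fit_centroids(train_x, train_y)
new_label = classify(centroids, (0.38, 2))  # label for a new application
```

The known labels are used only in `fit_centroids`; `classify` then applies the learned centroids to an observation it has never seen.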

Typical Tasks in Data Mining

Prediction

Given a collection of data with known numeric outputs, create a function that outputs a predicted value from a new set of inputs.

E.g. Given historical consumption of milk in the U.S., predict what the consumption will be over the next five years.
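As a rough illustration, the sketch below fits a straight line by ordinary least squares and extrapolates it five years ahead. The choice of a linear model is an assumption, and the yearly values are made-up numbers in arbitrary units, not real consumption figures.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(model, x):
    """Evaluate the fitted line at a new input."""
    slope, intercept = model
    return slope * x + intercept

# Made-up illustrative values, not real data
years = [2010, 2011, 2012, 2013, 2014]
consumption = [100.0, 98.0, 96.0, 94.0, 92.0]

model = fit_line(years, consumption)
forecast = {year: predict(model, year) for year in range(2015, 2020)}
```

Unlike classification, the output here is a number rather than a label.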

Typical Tasks in Data Mining

Clustering

Identify “natural” groupings in data

Unsupervised learning, no predefined groups

E.g. A city planner grouping houses by value, location, and house type.
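One common way to find such groupings is k-means, sketched below in plain Python. The slide does not prescribe an algorithm, so k-means, the two features (value in thousands, distance to the city center in km), the data, and the fixed starting centroids are all assumptions.

```python
from math import dist

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        centroids = [
            tuple(sum(column) / len(cluster) for column in zip(*cluster))
            if cluster else centroid  # keep an empty cluster's centroid in place
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical houses: (value in thousands, distance to center in km)
houses = [(100, 12), (110, 11), (105, 13), (400, 2), (420, 3), (410, 1)]

# Start from two of the data points to keep the run deterministic
centroids, clusters = k_means(houses, [houses[0], houses[3]])
```

No labels are supplied at any point; the two groups emerge from the distances alone. In practice the features would usually be rescaled first, since here the value column dominates the distance.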

Typical Tasks in Data Mining

Association Analysis

Identify relationships in data from co-occurring terms or items.

E.g. Analyze grocery store purchases to identify items most commonly purchased together. This is often used to create coupons and sales: buy chips and get $0.50 off salsa.
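The grocery example can be sketched by counting how often pairs of items share a basket. The baskets below are invented, and the `support` and `confidence` helpers are hypothetical names for the two standard measures used to rank such pairs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records, one set of items per basket
baskets = [
    {"chips", "salsa", "soda"},
    {"chips", "salsa"},
    {"bread", "milk"},
    {"chips", "soda"},
    {"bread", "chips", "salsa"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so each pair is always stored in the same order
    pair_counts.update(combinations(sorted(basket), 2))

def support(a, b):
    """Fraction of all baskets that contain both items."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)

def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    return pair_counts[tuple(sorted((antecedent, consequent)))] / item_counts[antecedent]

top_pair, top_count = pair_counts.most_common(1)[0]
```

With these baskets, chips and salsa are the most frequent pair, which is the kind of finding that would motivate the coupon in the example.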

Typical Tasks in Data Mining

Summarization

Given a data set, summarize the important characteristics of the data.

E.g. Calculate the mean and standard deviation, determine the statistical distribution, identify the most commonly appearing attribute values, etc.
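A few of these summaries can be computed directly with Python's standard library. The `summarize` helper and the sample values are illustrative assumptions; `stdev` here is the sample standard deviation.

```python
from collections import Counter
from statistics import mean, stdev

def summarize(values):
    """Collect a handful of descriptive statistics for one attribute."""
    return {
        "count": len(values),
        "mean": mean(values),
        "stdev": stdev(values),  # sample standard deviation (n - 1 denominator)
        "most_common": Counter(values).most_common(1)[0][0],
    }

# Hypothetical attribute values
summary = summarize([2, 4, 4, 4, 5, 5, 7, 9])
```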

Typical Tasks in Data Mining

Sequence Analysis

Given data collected over time, identify trends in the data that may be used to predict future events.

E.g. Analyzing stock data to identify stocks that will perform well vs. those that will perform poorly.
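A very simple way to expose a trend is to smooth the series with a moving average and compare its start and end. This is only a sketch of trend detection, not a forecasting method; the window size, the `trend` helper, and the price series are assumptions.

```python
def moving_average(series, window):
    """Mean of each run of `window` consecutive values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

def trend(series, window=3):
    """Compare the first and last smoothed values to label the overall direction."""
    smoothed = moving_average(series, window)
    if smoothed[-1] > smoothed[0]:
        return "up"
    if smoothed[-1] < smoothed[0]:
        return "down"
    return "flat"

# Hypothetical daily closing prices for two stocks
rising = [10, 11, 10, 12, 13, 12, 14]
falling = [20, 19, 20, 18, 17, 18, 16]

labels = {"rising": trend(rising), "falling": trend(falling)}
```

Smoothing matters because both series contain day-to-day reversals that would mislead a comparison of raw neighboring values.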

What is Data Mining?

Data Mining Process

[Flowchart: Fit a Model → Calculate Performance → Meet Criteria? If No, return to Fit a Model; if Yes, Interpret Model]
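The loop in the flowchart can be sketched as follows, assuming the "No" branch returns to the fitting step. The slide does not say what changes between rounds, so the model (a single threshold rule), the refinement (halving the search grid step), the target accuracy, and the data are all assumptions.

```python
def accuracy(threshold, data):
    """Calculate Performance: share of points where 'x > threshold' matches the label."""
    return sum((x > threshold) == bool(label) for x, label in data) / len(data)

def fit_threshold(train, step):
    """Fit a Model: pick the grid threshold with the best training accuracy."""
    low = min(x for x, _ in train)
    high = max(x for x, _ in train)
    candidates = []
    t = low
    while t <= high:
        candidates.append(t)
        t += step
    return max(candidates, key=lambda c: accuracy(c, train))

def run_process(train, validation, target=0.9, step=6.0, max_rounds=5):
    """Repeat fit and evaluate until the criterion is met."""
    for round_number in range(1, max_rounds + 1):
        model = fit_threshold(train, step)
        performance = accuracy(model, validation)
        if performance >= target:  # Meet Criteria? Yes: go on to interpretation
            return model, performance, round_number
        step /= 2  # No: refine the search grid and fit again
    raise RuntimeError("criteria not met within max_rounds")

# Hypothetical (x, label) pairs
train = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
validation = [(2.5, 0), (3.5, 0), (5.5, 1), (7.5, 1)]

model, performance, rounds = run_process(train, validation)
```

Performance is measured on data held out from fitting, so a model that merely memorizes the training set does not pass the check.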


Data Mining Algorithms

Apply/create a model

A model is an abstract description of data

What is the model’s function? (i.e. what task does it perform?)

How is the model represented? (E.g. mathematical function, rules, Gaussian distribution)
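The three representations named above can look quite different in code. The sketch below shows one toy instance of each; the coefficients, the rule, and the sample values are invented for illustration.

```python
from statistics import NormalDist

# 1. Mathematical function: output computed from the input by a formula
def linear_model(x):
    return 2.0 * x + 1.0  # hypothetical coefficients

# 2. Rules: output chosen by explicit if/then conditions
def rule_model(income, debt):
    return "high risk" if debt > 0.5 * income else "low risk"  # hypothetical rule

# 3. Gaussian distribution: the data described by a mean and a standard deviation
gaussian_model = NormalDist.from_samples([4.9, 5.1, 5.0, 5.2, 4.8])

function_output = linear_model(3)
rule_output = rule_model(income=100, debt=60)
typical_density = gaussian_model.pdf(5.0)  # high density near the mean
unusual_density = gaussian_model.pdf(6.0)  # much lower density far from it
```

In each case the model is a compact description that stands in for the raw data: a formula, a condition, or two parameters.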
