Bioinformatics_lectures / lecture8.pptx

Data Mining Algorithms

Determine the preference criterion

Given two models, which one is “better”?

Examples: goodness of fit, prediction accuracy, size/complexity, etc.

Search algorithm

Good models are found by searching the space of all possible models

How is this space organized and searched?
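The two ingredients above, a preference criterion and a search algorithm, can be sketched on a toy problem. Everything in this example is an assumption made for illustration: the data, the `ThresholdModel` class, and the choice of plain accuracy as the criterion are hypothetical and do not come from the lecture.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# Hypothetical toy data: (hours studied, passed the course?)
DATA: List[Tuple[float, bool]] = [
    (2, False), (4, False), (6, False), (8, True),
    (9, False), (11, True), (12, True), (15, True),
]


@dataclass(frozen=True)
class ThresholdModel:
    """A very small model: predict 'pass' when hours >= threshold."""
    threshold: float

    def predict(self, hours: float) -> bool:
        return hours >= self.threshold


def accuracy(model: ThresholdModel, data: List[Tuple[float, bool]]) -> float:
    """Preference criterion: fraction of examples predicted correctly."""
    hits = sum(model.predict(hours) == passed for hours, passed in data)
    return hits / len(data)


def search(candidates: Iterable[ThresholdModel],
           criterion: Callable[[ThresholdModel, List[Tuple[float, bool]]], float],
           data: List[Tuple[float, bool]]) -> ThresholdModel:
    """Search algorithm: score every candidate and keep the best one."""
    return max(candidates, key=lambda model: criterion(model, data))


# The model space here is one threshold per whole number of hours.
candidates = [ThresholdModel(t) for t in range(0, 17)]
best = search(candidates, accuracy, DATA)
print(best, accuracy(best, DATA))
```

Swapping `accuracy` for a criterion that also penalizes size or complexity changes which model wins without touching the search code, which is why the two choices are listed separately.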

Data Mining Models

Mathematical Functions

Mathematical combination of attribute values

E.g. linear model, non-linear model, support vectors, etc.

CPU performance prediction

PRP = -55.9 + 0.489 MYCT + 0.0153 MMIN + 0.0056 MMAX + 0.6410 CACH - 0.2700 CHMIN + 1.480 CHMAX
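A linear model of this kind is a weighted sum of attribute values plus an intercept. The sketch below evaluates one, using the coefficient magnitudes printed on the slide. The operators did not survive extraction, so the signs are an assumption (negative intercept and negative CHMIN coefficient, as in the commonly cited form of this CPU-performance regression), and the example machine is hypothetical.

```python
from typing import Dict

# Coefficient magnitudes as printed on the slide.
# ASSUMPTION: the signs (negative intercept, negative CHMIN) are not
# recoverable from the extracted text and are supplied here.
INTERCEPT = -55.9
COEFFS: Dict[str, float] = {
    "MYCT": 0.489,
    "MMIN": 0.0153,
    "MMAX": 0.0056,
    "CACH": 0.6410,
    "CHMIN": -0.2700,
    "CHMAX": 1.480,
}


def predict_prp(machine: Dict[str, float]) -> float:
    """Linear model: intercept plus the weighted sum of attribute values."""
    return INTERCEPT + sum(COEFFS[name] * machine[name] for name in COEFFS)


# Hypothetical machine description, for illustration only.
example = {"MYCT": 125, "MMIN": 256, "MMAX": 6000,
           "CACH": 256, "CHMIN": 16, "CHMAX": 128}
print(predict_prp(example))
```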

Data Mining Models

Decision Trees

[Figure: decision tree that assigns a grade (A, B, C, or F) by testing Study (>= 10 hours / < 10 hours), Do Homework (Yes / No), and Test Well (Yes / No).]
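A decision tree can be represented as nested tests that end in class labels. The sketch below encodes one plausible reading of the tree on this slide. The exact branch layout is an assumption, since the figure did not survive extraction, and the attribute names `study_ge_10_hours`, `does_homework`, and `tests_well` are hypothetical.

```python
from typing import Dict, Tuple, Union

# A node is either a leaf (a grade string) or (attribute, {value: subtree}).
Node = Union[str, Tuple[str, Dict[bool, "Node"]]]

# ASSUMPTION: this branch layout is a guess at the slide's figure.
TREE: Node = ("study_ge_10_hours", {
    True: ("does_homework", {
        True: ("tests_well", {True: "A", False: "B"}),
        False: "C",
    }),
    False: ("tests_well", {True: "C", False: "F"}),
})


def classify(node: Node, student: Dict[str, bool]) -> str:
    """Walk from the root to a leaf, following the student's attribute values."""
    while not isinstance(node, str):
        attribute, branches = node
        node = branches[student[attribute]]
    return node


student = {"study_ge_10_hours": True, "does_homework": True, "tests_well": False}
print(classify(TREE, student))
```

Each prediction follows a single root-to-leaf path, so the sequence of tests along that path doubles as an explanation of the result.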

 

 

 

Data Mining Models

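Neural Networks

[Figure: small neural network annotated with numeric values: 0.8, 0.23, -0.48, 0.5, 1.5, 0.67, 1.93, -0.88, -0.81, -0.4, 0.18. The network layout did not survive extraction.]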

 

Data Mining Models

Mixture Models

Data Mining Models

Bayesian Networks

Structure: Burglary -> Alarm <- Earthquake; Alarm -> John Calls; Alarm -> Mary Calls

Burglary:    P(B) = .001
Earthquake:  P(E) = .002

Alarm:
B   E   P(A)
T   T   0.95
T   F   0.95
F   T   0.29
F   F   0.001

John Calls:
A   P(J)
T   0.90
F   0.05

Mary Calls:
A   P(M)
T   0.70
F   0.01
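A Bayesian network factors the joint distribution into one conditional table per node, so any joint probability is a product of table entries. The sketch below uses the probabilities exactly as printed on the slide and assumes the usual structure for this example (Burglary and Earthquake are parents of Alarm; Alarm is the parent of John Calls and Mary Calls). The function names are hypothetical.

```python
from itertools import product

# Conditional probability tables, values as printed on the slide.
P_B = 0.001                      # P(Burglary = true)
P_E = 0.002                      # P(Earthquake = true)
P_A = {                          # P(Alarm = true | Burglary, Earthquake)
    (True, True): 0.95,
    (True, False): 0.95,
    (False, True): 0.29,
    (False, False): 0.001,
}
P_J = {True: 0.90, False: 0.05}  # P(John calls = true | Alarm)
P_M = {True: 0.70, False: 0.01}  # P(Mary calls = true | Alarm)


def bernoulli(p_true: float, value: bool) -> float:
    """Probability of `value` for a binary variable with P(true) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(b: bool, e: bool, a: bool, j: bool, m: bool) -> float:
    """P(B, E, A, J, M) as the product of each node's table entry."""
    return (bernoulli(P_B, b) * bernoulli(P_E, e)
            * bernoulli(P_A[(b, e)], a)
            * bernoulli(P_J[a], j) * bernoulli(P_M[a], m))


def posterior_burglary(j: bool, m: bool) -> float:
    """P(B = true | J = j, M = m), summing out Earthquake and Alarm."""
    values = [True, False]
    numerator = sum(joint(True, e, a, j, m) for e, a in product(values, repeat=2))
    evidence = sum(joint(b, e, a, j, m) for b, e, a in product(values, repeat=3))
    return numerator / evidence


print(joint(False, False, True, True, True))
print(posterior_burglary(True, True))
```

Enumeration like this is exponential in the number of hidden variables; it is shown only because it makes the factorization explicit on a five-node network.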

Searching the Model Space

Concept generalization can be viewed as search

Almost all search algorithms are heuristic

Optimal models are not guaranteed

Enumerating the space involves bias

Language bias – what the model can represent

Search bias – which models are ignored
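The points above can be illustrated with a greedy hill climber that stops at a local optimum. The score table is hypothetical, chosen so that a local and a global peak both exist; it stands in for any preference criterion evaluated over a one-dimensional model space.

```python
from typing import Callable

# Hypothetical preference scores for ten candidate models (indices 0..9).
# Index 2 is a local peak, index 7 is the global peak.
SCORES = [1, 2, 3, 2, 1, 2, 4, 6, 5, 3]


def score(model: int) -> float:
    return SCORES[model]


def hill_climb(start: int, criterion: Callable[[int], float],
               lo: int, hi: int) -> int:
    """Greedy heuristic search: move to the better neighbor until none improves."""
    current = start
    while True:
        neighbors = [n for n in (current - 1, current + 1) if lo <= n <= hi]
        best = max(neighbors, key=criterion)
        if criterion(best) <= criterion(current):
            return current           # no neighbor is better: stop here
        current = best


greedy = hill_climb(0, score, 0, len(SCORES) - 1)
exhaustive = max(range(len(SCORES)), key=score)
print(greedy, score(greedy))          # where the heuristic stops
print(exhaustive, score(exhaustive))  # the best model in the whole space
```

Starting from index 0, the climber never evaluates models past the first peak; that is search bias. Language bias is visible too: only the ten models in `SCORES` can ever be returned, whatever the search does.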

Searching the Model Space

[Figure: two candidate decision trees, labeled Model 1 and Model 2, built from the tests Study (>= 10 hours / < 10 hours), Homework (Yes / No), Test Well (Yes / No), and Good Project (Yes / No), with grades A, B, C, and F at the leaves.]

 

THANK YOU
