Лабораторная работа1.doc
Added: 01.05.2014

Instances: 14

Attributes: 5

outlook

temperature

humidity

windy

play

Test mode: split 35% train, remainder test

=== Clustering model (full training set) ===

Number of merges: 1

Number of splits: 0

Number of clusters: 21

node 0 [14]

| node 1 [5]

| | leaf 2 [1]

| node 1 [5]

| | leaf 3 [1]

| node 1 [5]

| | node 4 [2]

| | | leaf 5 [1]

| | node 4 [2]

| | | leaf 6 [1]

| node 1 [5]

| | leaf 7 [1]

node 0 [14]

| node 8 [6]

| | node 9 [2]

| | | leaf 10 [1]

| | node 9 [2]

| | | leaf 11 [1]

| node 8 [6]

| | leaf 12 [1]

| node 8 [6]

| | node 13 [3]

| | | leaf 14 [1]

| | node 13 [3]

| | | leaf 15 [1]

| | node 13 [3]

| | | leaf 16 [1]

node 0 [14]

| node 17 [3]

| | leaf 18 [1]

| node 17 [3]

| | leaf 19 [1]

| node 17 [3]

| | leaf 20 [1]

=== Model and evaluation on test split ===

Number of merges: 0

Number of splits: 0

Number of clusters: 7

node 0 [4]

| node 1 [2]

| | leaf 2 [1]

| node 1 [2]

| | leaf 3 [1]

node 0 [4]

| node 4 [2]

| | leaf 5 [1]

| node 4 [2]

| | leaf 6 [1]
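The tree printout repeats each parent line before every child, so the number of distinct node ids is what should match the reported "Number of clusters". A minimal sketch that checks this for the test-split tree above; the regular expression and the `parse_tree` helper are illustrative and not part of Weka:

```python
import re

# Test-split tree copied from the output above.
TREE = """\
node 0 [4]
| node 1 [2]
| | leaf 2 [1]
| node 1 [2]
| | leaf 3 [1]
node 0 [4]
| node 4 [2]
| | leaf 5 [1]
| node 4 [2]
| | leaf 6 [1]
"""

# Leading bars give the depth, then "node"/"leaf", the id, and the instance count.
LINE = re.compile(r"^((?:\|\s*)*)(node|leaf)\s+(\d+)\s+\[(\d+)\]$")

def parse_tree(text):
    """Return {node_id: (kind, depth, instance_count)}, collapsing repeated lines."""
    nodes = {}
    for raw in text.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue
        bars, kind, node_id, count = match.groups()
        nodes[int(node_id)] = (kind, bars.count("|"), int(count))
    return nodes

nodes = parse_tree(TREE)
leaves = sorted(i for i, (kind, _, _) in nodes.items() if kind == "leaf")
print(len(nodes), "clusters,", len(leaves), "leaves")
```

For this tree the distinct ids are 0 to 6, which agrees with "Number of clusters: 7".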

Clustered Instances

0 2 ( 20%)

1 6 ( 60%)

2 1 ( 10%)

4 1 ( 10%)
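The percentages in the "Clustered Instances" table are each cluster's share of the test instances. A short sketch that reproduces them from the counts above; the function name `cluster_shares` is hypothetical:

```python
def cluster_shares(counts):
    """Map cluster id -> share of instances in percent, rounded to whole percent."""
    total = sum(counts.values())
    return {cluster: round(100 * n / total) for cluster, n in counts.items()}

# Counts taken from the "Clustered Instances" table above.
cobweb_counts = {0: 2, 1: 6, 2: 1, 4: 1}
shares = cluster_shares(cobweb_counts)
print(sum(cobweb_counts.values()), shares)
```

The counts sum to 10 test instances, which is consistent with 4 of the 14 instances having been used for training (the root of the test-split tree is `node 0 [4]`).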

Thus, the algorithm returned a result of 85% "against".

EM

We set the split to 30%.
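How many of the 14 instances end up in the training part for a given split percentage can be estimated with the sketch below. It assumes the training size is truncated to a whole number; this rule is inferred from the counts printed in this report (4 training instances at 35% and 30%, 2 at 20%), not taken from the Weka documentation, and `split_sizes` is a hypothetical helper:

```python
def split_sizes(num_instances, train_percent):
    """Return (train, test) sizes, truncating the training size (assumed rule)."""
    train = num_instances * train_percent // 100
    return train, num_instances - train

for percent in (35, 30, 20):
    print(percent, split_sizes(14, percent))
```

With so few training instances, the models built on the split are fitted to only two to four rows, which is worth keeping in mind when comparing the algorithms.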

Scheme: weka.clusterers.EM -I 100 -N -1 -S 100 -M 1.0E-6

Relation: weather

Instances: 14

Attributes: 5

outlook

temperature

humidity

windy

play

Test mode: split 30% train, remainder test

=== Clustering model (full training set) ===

EM

==

Number of clusters selected by cross validation: 1

Cluster: 0 Prior probability: 1

Attribute: outlook

Discrete Estimator. Counts = 6 5 6 (Total = 17)

Attribute: temperature

Normal Distribution. Mean = 73.5714 StdDev = 6.3326

Attribute: humidity

Normal Distribution. Mean = 81.6429 StdDev = 9.9111

Attribute: windy

Discrete Estimator. Counts = 7 9 (Total = 16)

Attribute: play

Discrete Estimator. Counts = 10 6 (Total = 16)
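The totals printed by the discrete estimators (17 and 16) are larger than the 14 instances in the data set. This is consistent with each nominal value starting with a count of 1 (Laplace smoothing): 14 + 3 = 17 for the three-valued `outlook`, 14 + 2 = 16 for the two-valued `windy` and `play`. A sketch under that assumption; the raw counts below are obtained by subtracting 1 from the printed ones, and `laplace_counts` is a hypothetical helper:

```python
def laplace_counts(raw_counts, start=1):
    """Add a starting count to every value and return (counts, total, probabilities)."""
    smoothed = [count + start for count in raw_counts]
    total = sum(smoothed)
    probabilities = [count / total for count in smoothed]
    return smoothed, total, probabilities

# Raw counts assumed from the printed estimator (printed value minus 1).
outlook = laplace_counts([5, 4, 5])
windy = laplace_counts([6, 8])
play = laplace_counts([9, 5])
print(outlook[0], outlook[1])
print(windy[0], windy[1])
print(play[0], play[1])
```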

=== Model and evaluation on test split ===

EM

==

Number of clusters selected by cross validation: 1

Cluster: 0 Prior probability: 1

Attribute: outlook

Discrete Estimator. Counts = 1 3 3 (Total = 7)

Attribute: temperature

Normal Distribution. Mean = 70.25 StdDev = 6.7593

Attribute: humidity

Normal Distribution. Mean = 75.25 StdDev = 9.7564

Attribute: windy

Discrete Estimator. Counts = 4 2 (Total = 6)

Attribute: play

Discrete Estimator. Counts = 3 3 (Total = 6)

Clustered Instances

0 10 (100%)
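With a single cluster, the model is one set of per-attribute estimators, and the log density of an instance can be sketched as the sum of the per-attribute log probabilities. The sketch below uses the test-split estimates printed above and assumes the attributes are treated as independent within the cluster; the example instance and its value indices are hypothetical, and the exact procedure behind the `Log likelihood` value Weka reports is not reproduced here:

```python
import math

def normal_log_pdf(x, mean, std_dev):
    """Log density of a normal distribution at x."""
    variance = std_dev ** 2
    return -0.5 * math.log(2 * math.pi * variance) - (x - mean) ** 2 / (2 * variance)

def discrete_log_prob(value_index, counts):
    """Log probability of a nominal value given its estimator counts."""
    return math.log(counts[value_index] / sum(counts))

# Estimates copied from the test-split EM model above.
MODEL = {
    "outlook": ("discrete", [1, 3, 3]),
    "temperature": ("normal", (70.25, 6.7593)),
    "humidity": ("normal", (75.25, 9.7564)),
    "windy": ("discrete", [4, 2]),
    "play": ("discrete", [3, 3]),
}

def log_density(instance):
    """Sum per-attribute log probabilities (independence is an assumption)."""
    total = 0.0
    for name, (kind, params) in MODEL.items():
        if kind == "normal":
            total += normal_log_pdf(instance[name], *params)
        else:
            total += discrete_log_prob(instance[name], params)
    return total

# Hypothetical instance: nominal attributes are given as value indices.
example = {"outlook": 2, "temperature": 71, "humidity": 91, "windy": 0, "play": 1}
value = log_density(example)
print(value)
```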

Log likelihood: -10.4135

KMEANS:

We set the split to 20%.

Scheme: weka.clusterers.SimpleKMeans -N 2 -S 10

Relation: weather

Instances: 14

Attributes: 5

outlook

temperature

humidity

windy

play

Test mode: split 20% train, remainder test

=== Clustering model (full training set) ===

kMeans

======

Number of iterations: 3

Within cluster sum of squared errors: 16.23745631138724

Cluster centroids:

Cluster 0

Mean/Mode: sunny 75.8889 84.1111 FALSE yes

Std Devs: N/A 6.4893 8.767 N/A N/A

Cluster 1

Mean/Mode: overcast 69.4 77.2 TRUE yes

Std Devs: N/A 4.7223 12.3167 N/A N/A

=== Model and evaluation on test split ===

kMeans

======

Number of iterations: 2

Within cluster sum of squared errors: 0.0

Cluster centroids:

Cluster 0

Mean/Mode: rainy 71 91 TRUE no

Std Devs: N/A 0 0 N/A N/A

Cluster 1

Mean/Mode: rainy 65 70 TRUE no

Std Devs: N/A 0 0 N/A N/A

Clustered Instances

0 9 ( 75%)

1 3 ( 25%)
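The test-split model reports a within-cluster sum of squared errors of 0.0 and zero standard deviations. That is what one would expect if each of the two clusters holds a single training instance, so that every centroid coincides with its only member. A sketch of the "Mean/Mode" centroid and the squared error for mixed attributes; the 0/1 mismatch distance for nominal values and the range normalisation of numeric values are assumptions, and the two rows are the centroids printed above treated as training instances:

```python
from collections import Counter
from statistics import mean

NUMERIC = (False, True, True, False, False)  # outlook, temperature, humidity, windy, play

def centroid(rows):
    """Mean for numeric columns, mode for nominal columns."""
    columns = list(zip(*rows))
    return tuple(
        mean(col) if is_num else Counter(col).most_common(1)[0][0]
        for col, is_num in zip(columns, NUMERIC)
    )

def squared_error(row, center, ranges):
    """Squared distance: range-normalised numeric difference, 0/1 nominal mismatch."""
    total = 0.0
    for value, c, is_num, rng in zip(row, center, NUMERIC, ranges):
        if is_num:
            diff = (value - c) / rng if rng else 0.0
        else:
            diff = 0.0 if value == c else 1.0
        total += diff ** 2
    return total

rows = [("rainy", 71, 91, "TRUE", "no"), ("rainy", 65, 70, "TRUE", "no")]
ranges = (None, 71 - 65, 91 - 70, None, None)

# One instance per cluster: each centroid equals its member, so the error is zero.
clusters = [[rows[0]], [rows[1]]]
sse = sum(
    squared_error(row, centroid(members), ranges)
    for members in clusters
    for row in members
)
print(sse)
```

Putting both rows into one cluster instead would give the centroid `("rainy", 68, 80.5, "TRUE", "no")` and a non-zero error.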

Conclusions:

The more accurate results were obtained with the NaiveBayes classification algorithm and the EM clustering algorithm. The first identifies regularities that relate the attributes to the class. The second groups the instances into clusters (taxa) according to their attribute values.