Ministry of Science and Education of the Russian Federation
Saint Petersburg State Electrotechnical University
MOEVM Department
Report on Laboratory Work No. 1
for the course
"Knowledge Bases and Expert Systems"
Variant No. 1
Completed by: D. A. Belov
Group: 3341
Saint Petersburg
2006
-
Description of the input data
Variant 1 corresponds to the file contact-lenses.arff
The data in the file describe:
- people of different ages
- the state of their vision.
The table contains the following attributes:
- age
- spectacle-prescrip (spectacle prescription)
- astigmatism
- tear-prod-rate (tear production rate)
- contact-lenses (the class attribute).
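To make the shape of the data set concrete, the following sketch enumerates every combination of the four input attributes. The value names are taken from the Weka output later in this report. The names `ATTRIBUTES`, `CLASS_VALUES` and `all_combinations` are illustrative, and the idea that each combination occurs exactly once is an assumption, although it is consistent with the 24 instances that Weka reports.

```python
from itertools import product

# Attribute domains as they appear in the Weka output in this report.
ATTRIBUTES = {
    "age": ["young", "pre-presbyopic", "presbyopic"],
    "spectacle-prescrip": ["myope", "hypermetrope"],
    "astigmatism": ["no", "yes"],
    "tear-prod-rate": ["reduced", "normal"],
}
CLASS_VALUES = ["soft", "hard", "none"]  # values of the contact-lenses attribute


def all_combinations():
    """Return every possible patient description as a dict."""
    names = list(ATTRIBUTES)
    return [dict(zip(names, values)) for values in product(*ATTRIBUTES.values())]


if __name__ == "__main__":
    rows = all_combinations()
    print(len(rows))  # 3 * 2 * 2 * 2 combinations
    print(rows[0])
```

With only 24 possible inputs, every classifier in this report is effectively learning a small lookup table, which is why the trees below stay so compact.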
-
Problem analysis and selection of the appropriate class of algorithms
We apply classification algorithms, because the only useful knowledge that can be extracted from these data is a classification: which patients, given their age and eye condition, need contact lenses, and of which type.
Among the classification algorithms, rule-building and tree-building methods were selected (ID3, LMT and J48); Naive Bayes was run as well.
-
Results of running the algorithms.
ID3
=== Run information ===
Scheme: weka.classifiers.trees.Id3
Relation: contact-lenses
Instances: 24
Attributes: 5
age
spectacle-prescrip
astigmatism
tear-prod-rate
contact-lenses
Test mode: 10-fold cross-validation
=== Classifier model (full training set) ===
Id3
tear-prod-rate = reduced: none
tear-prod-rate = normal
| astigmatism = no
| | age = young: soft
| | age = pre-presbyopic: soft
| | age = presbyopic
| | | spectacle-prescrip = myope: none
| | | spectacle-prescrip = hypermetrope: soft
| astigmatism = yes
| | spectacle-prescrip = myope: hard
| | spectacle-prescrip = hypermetrope
| | | age = young: hard
| | | age = pre-presbyopic: none
| | | age = presbyopic: none
Time taken to build model: 0 seconds
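The Id3 tree printed above can be read as a set of nested rules. The sketch below transcribes it into a Python function and checks which attribute has the largest information gain at the root. It assumes that the data set holds one instance per attribute combination and that the unpruned tree reproduces the training labels exactly; neither is stated by Weka. All function names are illustrative.

```python
from collections import Counter
from itertools import product
from math import log2

DOMAINS = {
    "age": ["young", "pre-presbyopic", "presbyopic"],
    "spectacle-prescrip": ["myope", "hypermetrope"],
    "astigmatism": ["no", "yes"],
    "tear-prod-rate": ["reduced", "normal"],
}


def id3_predict(row):
    """Transcription of the Id3 tree printed in this report."""
    if row["tear-prod-rate"] == "reduced":
        return "none"
    if row["astigmatism"] == "no":
        if row["age"] == "presbyopic" and row["spectacle-prescrip"] == "myope":
            return "none"
        return "soft"
    # astigmatism = yes
    if row["spectacle-prescrip"] == "myope":
        return "hard"
    return "hard" if row["age"] == "young" else "none"


def labelled_rows():
    """Assumption: one instance per combination, labelled by the tree."""
    names = list(DOMAINS)
    rows = [dict(zip(names, v)) for v in product(*DOMAINS.values())]
    return [(r, id3_predict(r)) for r in rows]


def entropy(labels):
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())


def information_gain(data, attribute):
    """Class entropy minus the weighted entropy after splitting on attribute."""
    labels = [y for _, y in data]
    remainder = 0.0
    for value in DOMAINS[attribute]:
        subset = [y for r, y in data if r[attribute] == value]
        if subset:
            remainder += len(subset) / len(data) * entropy(subset)
    return entropy(labels) - remainder


if __name__ == "__main__":
    data = labelled_rows()
    print(Counter(y for _, y in data))
    for name in DOMAINS:
        print(f"{name}: {information_gain(data, name):.3f}")
```

Under these assumptions the class counts come out as 5 soft, 4 hard and 15 none, which agrees with the row totals of the confusion matrices in this report, and tear-prod-rate has the highest gain, which agrees with it being the root of the tree.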
=== Stratified cross-validation ===
=== Summary ===
Correctly Classified Instances      17    70.8333 %
Incorrectly Classified Instances    7     29.1667 %
Kappa statistic                     0.4381
Mean absolute error                 0.1944
Root mean squared error             0.441
Relative absolute error             51.4706 %
Root relative squared error         100.965 %
Total Number of Instances           24
=== Detailed Accuracy By Class ===
TP Rate   FP Rate   Precision   Recall   F-Measure   Class
0.8       0.053     0.8         0.8      0.8         soft
0.25      0.1       0.333       0.25     0.286       hard
0.8       0.444     0.75        0.8      0.774       none
=== Confusion Matrix ===
a   b   c    <-- classified as
4   0   1  | a = soft
0   1   3  | b = hard
1   2   12 | c = none
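The accuracy and Kappa statistic in the summary follow directly from the confusion matrix. As a minimal sketch, the function below recomputes both from the Id3 matrix above using the standard Cohen's kappa formula; the name `accuracy_and_kappa` is illustrative and is not part of Weka.

```python
def accuracy_and_kappa(matrix):
    """matrix[i][j] = number of class-i instances classified as class j."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Rows and columns in the order soft, hard, none.
ID3_MATRIX = [
    [4, 0, 1],
    [0, 1, 3],
    [1, 2, 12],
]

if __name__ == "__main__":
    acc, kappa = accuracy_and_kappa(ID3_MATRIX)
    print(f"accuracy = {acc:.4%}, kappa = {kappa:.4f}")
```

Kappa corrects the raw accuracy for the agreement that the class proportions alone would produce, which matters here because 15 of the 24 instances belong to the class none.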
Naive Bayes
=== Run information ===
Scheme: weka.classifiers.bayes.NaiveBayes
Relation: contact-lenses
Instances: 24
Attributes: 5
age
spectacle-prescrip
astigmatism
tear-prod-rate
contact-lenses
Test mode: 10-fold cross-validation
=== Classifier model (full training set) ===
Naive Bayes Classifier
Class soft: Prior probability = 0.22
age: Discrete Estimator. Counts = 3 3 2 (Total = 8)
spectacle-prescrip: Discrete Estimator. Counts = 3 4 (Total = 7)
astigmatism: Discrete Estimator. Counts = 6 1 (Total = 7)
tear-prod-rate: Discrete Estimator. Counts = 1 6 (Total = 7)
Class hard: Prior probability = 0.19
age: Discrete Estimator. Counts = 3 2 2 (Total = 7)
spectacle-prescrip: Discrete Estimator. Counts = 4 2 (Total = 6)
astigmatism: Discrete Estimator. Counts = 1 5 (Total = 6)
tear-prod-rate: Discrete Estimator. Counts = 1 5 (Total = 6)
Class none: Prior probability = 0.59
age: Discrete Estimator. Counts = 5 6 7 (Total = 18)
spectacle-prescrip: Discrete Estimator. Counts = 8 9 (Total = 17)
astigmatism: Discrete Estimator. Counts = 8 9 (Total = 17)
tear-prod-rate: Discrete Estimator. Counts = 13 4 (Total = 17)
Time taken to build model: 0 seconds
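The counts in the model above appear to be Laplace-smoothed: every count is one larger than the raw frequency, so no probability is ever zero. The sketch below uses the printed counts to classify one patient. The class sizes (5, 4, 15) are taken from the row totals of the confusion matrices in this report; the assumption that the counts are listed in the same value order as in the tree output, and all function names, are illustrative.

```python
# Value order assumed to match the order used elsewhere in this report.
VALUES = {
    "age": ["young", "pre-presbyopic", "presbyopic"],
    "spectacle-prescrip": ["myope", "hypermetrope"],
    "astigmatism": ["no", "yes"],
    "tear-prod-rate": ["reduced", "normal"],
}

# Smoothed counts copied from the Naive Bayes model printed above.
COUNTS = {
    "soft": {"age": [3, 3, 2], "spectacle-prescrip": [3, 4],
             "astigmatism": [6, 1], "tear-prod-rate": [1, 6]},
    "hard": {"age": [3, 2, 2], "spectacle-prescrip": [4, 2],
             "astigmatism": [1, 5], "tear-prod-rate": [1, 5]},
    "none": {"age": [5, 6, 7], "spectacle-prescrip": [8, 9],
             "astigmatism": [8, 9], "tear-prod-rate": [13, 4]},
}
CLASS_COUNTS = {"soft": 5, "hard": 4, "none": 15}  # raw class frequencies


def prior(cls):
    """Laplace-smoothed class prior: (n_c + 1) / (N + number of classes)."""
    total = sum(CLASS_COUNTS.values())
    return (CLASS_COUNTS[cls] + 1) / (total + len(CLASS_COUNTS))


def likelihood(cls, attribute, value):
    counts = COUNTS[cls][attribute]
    return counts[VALUES[attribute].index(value)] / sum(counts)


def posterior(row):
    """Normalised product of the prior and the per-attribute likelihoods."""
    scores = {}
    for cls in COUNTS:
        score = prior(cls)
        for attribute, value in row.items():
            score *= likelihood(cls, attribute, value)
        scores[cls] = score
    norm = sum(scores.values())
    return {cls: s / norm for cls, s in scores.items()}


if __name__ == "__main__":
    patient = {"age": "young", "spectacle-prescrip": "myope",
               "astigmatism": "no", "tear-prod-rate": "normal"}
    print({cls: round(prior(cls), 2) for cls in COUNTS})
    print(posterior(patient))
```

The priors computed this way round to 0.22, 0.19 and 0.59, the values Weka prints, which supports the smoothing assumption.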
=== Stratified cross-validation ===
=== Summary ===
Correctly Classified Instances      17    70.8333 %
Incorrectly Classified Instances    7     29.1667 %
Kappa statistic                     0.4381
Mean absolute error                 0.2545
Root mean squared error             0.3326
Relative absolute error             67.3578 %
Root relative squared error         76.1544 %
Total Number of Instances           24
=== Detailed Accuracy By Class ===
TP Rate   FP Rate   Precision   Recall   F-Measure   Class
0.8       0.053     0.8         0.8      0.8         soft
0.25      0.1       0.333       0.25     0.286       hard
0.8       0.444     0.75        0.8      0.774       none
=== Confusion Matrix ===
a   b   c    <-- classified as
4   0   1  | a = soft
0   1   3  | b = hard
1   2   12 | c = none
LMT
=== Run information ===
Scheme: weka.classifiers.trees.LMT -I -1 -M 15
Relation: contact-lenses
Instances: 24
Attributes: 5
age
spectacle-prescrip
astigmatism
tear-prod-rate
contact-lenses
Test mode: 10-fold cross-validation
=== Classifier model (full training set) ===
Logistic model tree
------------------
: LM_1:36/36 (24)
Number of Leaves : 1
Size of the Tree : 1
LM_1:
Class 0 :
-0.05 +
[age=pre-presbyopic] * 2.86 +
[age=presbyopic] * -4.85 +
[spectacle-prescrip] * 9.59 +
[astigmatism] * -22.94 +
[tear-prod-rate] * 1.5
Class 1 :
-11.45 +
[age=young] * 3.05 +
[age=pre-presbyopic] * -1.57 +
[spectacle-prescrip] * -13.85 +
[astigmatism] * 17.09 +
[tear-prod-rate] * 2.36
Class 2 :
25.46 +
[age=young] * -8.13 +
[age=presbyopic] * 2.29 +
[tear-prod-rate] * -26.68
Time taken to build model: 0.2 seconds
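Because the tree has a single leaf, the LMT model reduces to one linear function per class. The sketch below evaluates the three functions printed above and turns them into probabilities with a softmax. Two points are assumptions rather than facts from the Weka output: that Class 0, 1, 2 correspond to soft, hard, none, and that each binary attribute is encoded as 1 for its second value (hypermetrope, yes, normal) and 0 otherwise. The names `features`, `MODEL` and `class_probabilities` are illustrative.

```python
from math import exp

CLASSES = ["soft", "hard", "none"]  # assumed order of Class 0, 1, 2


def features(row):
    """Indicator encoding; the 0/1 mapping of binary attributes is assumed."""
    return {
        "age=young": float(row["age"] == "young"),
        "age=pre-presbyopic": float(row["age"] == "pre-presbyopic"),
        "age=presbyopic": float(row["age"] == "presbyopic"),
        "spectacle-prescrip": float(row["spectacle-prescrip"] == "hypermetrope"),
        "astigmatism": float(row["astigmatism"] == "yes"),
        "tear-prod-rate": float(row["tear-prod-rate"] == "normal"),
    }


# (intercept, weights) per class, copied from LM_1 above.
MODEL = [
    (-0.05, {"age=pre-presbyopic": 2.86, "age=presbyopic": -4.85,
             "spectacle-prescrip": 9.59, "astigmatism": -22.94,
             "tear-prod-rate": 1.5}),
    (-11.45, {"age=young": 3.05, "age=pre-presbyopic": -1.57,
              "spectacle-prescrip": -13.85, "astigmatism": 17.09,
              "tear-prod-rate": 2.36}),
    (25.46, {"age=young": -8.13, "age=presbyopic": 2.29,
             "tear-prod-rate": -26.68}),
]


def class_probabilities(row):
    x = features(row)
    scores = [b + sum(w * x[name] for name, w in weights.items())
              for b, weights in MODEL]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - top) for s in scores]
    norm = sum(exps)
    return {cls: e / norm for cls, e in zip(CLASSES, exps)}


if __name__ == "__main__":
    patient = {"age": "young", "spectacle-prescrip": "myope",
               "astigmatism": "yes", "tear-prod-rate": "normal"}
    print(class_probabilities(patient))
```

Under this encoding the large negative weight on tear-prod-rate for Class 2 plays the same role as the root split of the decision trees: a reduced tear production rate pushes the prediction strongly towards none.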
=== Stratified cross-validation ===
=== Summary ===
Correctly Classified Instances      17    70.8333 %
Incorrectly Classified Instances    7     29.1667 %
Kappa statistic                     0.4766
Mean absolute error                 0.2484
Root mean squared error             0.3748
Relative absolute error             65.7427 %
Root relative squared error         85.8137 %
Total Number of Instances           24
=== Detailed Accuracy By Class ===
TP Rate   FP Rate   Precision   Recall   F-Measure   Class
0.8       0.053     0.8         0.8      0.8         soft
0.5       0.15      0.4         0.5      0.444       hard
0.733     0.333     0.786       0.733    0.759       none
=== Confusion Matrix ===
a   b   c    <-- classified as
4   0   1  | a = soft
0   2   2  | b = hard
1   3   11 | c = none
J48
=== Run information ===
Scheme: weka.classifiers.trees.J48 -C 0.25 -M 2
Relation: contact-lenses
Instances: 24
Attributes: 5
age
spectacle-prescrip
astigmatism
tear-prod-rate
contact-lenses
Test mode: 10-fold cross-validation
=== Classifier model (full training set) ===
J48 pruned tree
------------------
tear-prod-rate = reduced: none (12.0)
tear-prod-rate = normal
| astigmatism = no: soft (6.0/1.0)
| astigmatism = yes
| | spectacle-prescrip = myope: hard (3.0)
| | spectacle-prescrip = hypermetrope: none (3.0/1.0)
Number of Leaves : 4
Size of the tree : 7
Time taken to build model: 0.03 seconds
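The pruned J48 tree is the Id3 tree with its two deepest subtrees collapsed into leaves. The sketch below transcribes both trees and lists the attribute combinations on which they disagree. It assumes one instance per combination and that the unpruned Id3 tree reproduces the training labels, in which case the disagreements are exactly the training errors reported in the J48 leaf annotations (6.0/1.0) and (3.0/1.0). Function names are illustrative.

```python
from itertools import product

DOMAINS = {
    "age": ["young", "pre-presbyopic", "presbyopic"],
    "spectacle-prescrip": ["myope", "hypermetrope"],
    "astigmatism": ["no", "yes"],
    "tear-prod-rate": ["reduced", "normal"],
}


def id3_predict(row):
    """Transcription of the Id3 tree from this report."""
    if row["tear-prod-rate"] == "reduced":
        return "none"
    if row["astigmatism"] == "no":
        if row["age"] == "presbyopic" and row["spectacle-prescrip"] == "myope":
            return "none"
        return "soft"
    if row["spectacle-prescrip"] == "myope":
        return "hard"
    return "hard" if row["age"] == "young" else "none"


def j48_predict(row):
    """Transcription of the J48 pruned tree printed above."""
    if row["tear-prod-rate"] == "reduced":
        return "none"
    if row["astigmatism"] == "no":
        return "soft"
    return "hard" if row["spectacle-prescrip"] == "myope" else "none"


def disagreements():
    """Attribute combinations where the two trees give different classes."""
    names = list(DOMAINS)
    rows = [dict(zip(names, v)) for v in product(*DOMAINS.values())]
    return [r for r in rows if id3_predict(r) != j48_predict(r)]


if __name__ == "__main__":
    for row in disagreements():
        print(row, "Id3:", id3_predict(row), "J48:", j48_predict(row))
```

The pruned tree trades two errors on the training set for a simpler model with 4 leaves, and in this report it is also the model with the best cross-validated accuracy.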
=== Stratified cross-validation ===
=== Summary ===
Correctly Classified Instances      20    83.3333 %
Incorrectly Classified Instances    4     16.6667 %
Kappa statistic                     0.71
Mean absolute error                 0.15
Root mean squared error             0.3249
Relative absolute error             39.7059 %
Root relative squared error         74.3898 %
Total Number of Instances           24
=== Detailed Accuracy By Class ===
TP Rate   FP Rate   Precision   Recall   F-Measure   Class
1         0.053     0.833       1        0.909       soft
0.75      0.1       0.6         0.75     0.667       hard
0.8       0.111     0.923       0.8      0.857       none
=== Confusion Matrix ===
a   b   c    <-- classified as
5   0   0  | a = soft
0   3   1  | b = hard
1   2   12 | c = none
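The per-class figures in the "Detailed Accuracy By Class" table can be recomputed from the confusion matrix. The sketch below does this for the J48 matrix above using the usual definitions of precision, recall (TP rate), FP rate and F-measure; the function name `per_class_metrics` is illustrative.

```python
def per_class_metrics(matrix, names):
    """matrix[i][j] = number of class-i instances classified as class j."""
    total = sum(sum(row) for row in matrix)
    result = {}
    for i, name in enumerate(names):
        tp = matrix[i][i]
        actual = sum(matrix[i])                    # instances of this class
        predicted = sum(row[i] for row in matrix)  # instances assigned to it
        fp = predicted - tp
        recall = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        fp_rate = fp / (total - actual) if total != actual else 0.0
        result[name] = {
            "tp_rate": recall,
            "fp_rate": fp_rate,
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
        }
    return result


# Rows and columns in the order soft, hard, none.
J48_MATRIX = [
    [5, 0, 0],
    [0, 3, 1],
    [1, 2, 12],
]

if __name__ == "__main__":
    for cls, m in per_class_metrics(J48_MATRIX, ["soft", "hard", "none"]).items():
        print(cls, {k: round(v, 3) for k, v in m.items()})
```

The same function can be applied to the other three confusion matrices in this report to compare the classifiers class by class; the class hard, with only 4 instances, is the hardest to recognise for all of them.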
-
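Conclusion
In the course of this work we gained practical experience in applying data mining (DM) algorithms to extract knowledge from a data set. Under 10-fold cross-validation, J48 classified 20 of the 24 instances correctly (83.33 %), while ID3, Naive Bayes and LMT each classified 17 (70.83 %). The results obtained confirm that the chosen classes of algorithms and the approach to knowledge discovery were appropriate.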