Wednesday, October 3, 2018

Plant Leaf Classification - Part 2

Abstract

In this post, I continue the analysis of the plant leaf dataset available from the UCI Machine Learning Repository at:

https://archive.ics.uci.edu/ml/datasets/One-hundred+plant+species+leaves+data+set#

This part introduces the classification models.

Introduction

The dataset comprises sixteen samples each of one hundred plant species. Its analysis was introduced in the paper cited as ref. [1]. That paper describes a method designed to work with small training sets and possibly incomplete feature extraction. This motivates separate processing of three feature types:

  • shape
  • texture
  • margin

Those are then combined to provide an overall indication of the species (and the associated probability). For an accurate description of those features, please see ref. [1], where the classification is implemented by a K-Nearest-Neighbor density estimator. The authors report the accuracy reached by KNN (with proportional and weighted proportional kernel density) for every combination of the three feature sets (see ref. [1], table 2, p. 8). The top accuracy is about 96%, and it is achieved when all three feature sets are used.

In this post, I use Linear Discriminant Analysis for classification and compare my results with those in ref. [1], table 2.

Packages

suppressPackageStartupMessages(library(caret))
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(ggplot2))

Classification Models

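load('PlantLeafEnvironment.RData')
set.seed(1023)

# Fits an LDA classifier on a stratified 70/30 split of leaf_dataset and
# returns the fitted caret model together with the test-set confusion matrix.
plant_leaf_model <- function(leaf_dataset) {

    # Stratified split by species: 70% training, 30% test
    train_idx <- createDataPartition(leaf_dataset$species, p = 0.7, list = FALSE)
    # 10-fold cross-validation, repeated 5 times
    trControl <- trainControl(method = "repeatedcv", number = 10, repeats = 5, verboseIter = FALSE)

    # Every column except the identifier and the label is a predictor
    relevant_features <- setdiff(colnames(leaf_dataset), c("id", "species"))

    # Build the formula species ~ feature_1 + feature_2 + ...
    feat_sum <- paste(relevant_features, collapse = "+")
    frm <- as.formula(paste("species ~ ", feat_sum))

    train_set <- leaf_dataset[train_idx,]
    test_set <- leaf_dataset[-train_idx,]

    # method = "lda" has no tuning parameters, so tuneLength has no effect here
    lda_fit <- train(frm,
                     data = train_set,
                     method = "lda",
                     tuneLength = 10,
                     preProcess = c("BoxCox", "center", "scale", "pca"),
                     trControl = trControl,
                     metric = 'Accuracy')

    # Evaluate on the held-out test set
    lda_test_pred <- predict(lda_fit, test_set)

    cfm <- confusionMatrix(lda_test_pred, test_set$species)

    list(model = lda_fit, confusionMatrix = cfm)
}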
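
To make the preprocessing chain in the function above more concrete, here is a minimal sketch that unrolls the "center", "scale", "pca" and LDA steps by hand, using only base R and MASS. It is not the code used for the results in this post: the built-in iris data stands in for the leaf data, the Box-Cox step is left out, and the 95% cumulative-variance cutoff is assumed because it is caret's default PCA threshold.

```r
library(MASS)  # provides lda()

set.seed(1023)

# iris is only a stand-in for the leaf data (assumption for illustration)
features <- setdiff(colnames(iris), "Species")

# Stratified 70/30 split: sample 70% of the rows within each species
rows_by_species <- split(seq_len(nrow(iris)), iris$Species)
train_idx <- unlist(lapply(rows_by_species, function(rows) {
  sample(rows, size = ceiling(0.7 * length(rows)))
}))
train_set <- iris[train_idx, ]
test_set  <- iris[-train_idx, ]

# Center, scale and rotate the predictors, using training data only
pca <- prcomp(train_set[, features], center = TRUE, scale. = TRUE)

# Keep the leading components that explain at least 95% of the variance
explained <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
n_comp <- which(explained >= 0.95)[1]

# Project both sets onto the retained components
train_scores <- as.data.frame(pca$x[, seq_len(n_comp), drop = FALSE])
train_scores$species <- train_set$Species
test_scores <- as.data.frame(
  predict(pca, test_set[, features])[, seq_len(n_comp), drop = FALSE]
)

# Fit LDA on the component scores and measure test-set accuracy
fit <- lda(species ~ ., data = train_scores)
pred <- predict(fit, test_scores)$class
accuracy <- mean(pred == test_set$Species)
print(accuracy)
```

The point of the sketch is that the PCA rotation is estimated on the training rows and then reused unchanged on the test rows, which is also what caret does when preProcess is passed to train.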

Margin Data

result <- plant_leaf_model(margin_data)
result$model
## Linear Discriminant Analysis 
## 
## 1199 samples
##   64 predictor
##  100 classes: 'Acer Campestre', 'Acer Capillipes', 'Acer Circinatum', 'Acer Mono', 'Acer Opalus', 'Acer Palmatum', 'Acer Pictum', 'Acer Platanoids', 'Acer Rubrum', 'Acer Rufinerve', 'Acer Saccharinum', 'Alnus Cordata', 'Alnus Maximowiczii', 'Alnus Rubra', 'Alnus Sieboldiana', 'Alnus Viridis', 'Arundinaria Simonii', 'Betula Austrosinensis', 'Betula Pendula', 'Callicarpa Bodinieri', 'Castanea Sativa', 'Celtis Koraiensis', 'Cercis Siliquastrum', 'Cornus Chinensis', 'Cornus Controversa', 'Cornus Macrophylla', 'Cotinus Coggygria', 'Crataegus Monogyna', 'Cytisus Battandieri', 'Eucalyptus Glaucescens', 'Eucalyptus Neglecta', 'Eucalyptus Urnigera', 'Fagus Sylvatica', 'Ginkgo Biloba', 'Ilex Aquifolium', 'Ilex Cornuta', 'Liquidambar Styraciflua', 'Liriodendron Tulipifera', 'Lithocarpus Cleistocarpus', 'Lithocarpus Edulis', 'Magnolia Heptapeta', 'Magnolia Salicifolia', 'Morus Nigra', 'Olea Europaea', 'Phildelphus', 'Populus Adenopoda', 'Populus Grandidentata', 'Populus Nigra', 'Prunus Avium', 'Prunus X Shmittii', 'Pterocarya Stenoptera', 'Quercus Afares', 'Quercus Agrifolia', 'Quercus Alnifolia', 'Quercus Brantii', 'Quercus Canariensis', 'Quercus Castaneifolia', 'Quercus Cerris', 'Quercus Chrysolepis', 'Quercus Coccifera', 'Quercus Coccinea', 'Quercus Crassifolia', 'Quercus Crassipes', 'Quercus Dolicholepis', 'Quercus Ellipsoidalis', 'Quercus Greggii', 'Quercus Hartwissiana', 'Quercus Ilex', 'Quercus Imbricaria', 'Quercus Infectoria sub', 'Quercus Kewensis', 'Quercus Nigra', 'Quercus Palustris', 'Quercus Phellos', 'Quercus Phillyraeoides', 'Quercus Pontica', 'Quercus Pubescens', 'Quercus Pyrenaica', 'Quercus Rhysophylla', 'Quercus Rubra', 'Quercus Semecarpifolia', 'Quercus Shumardii', 'Quercus Suber', 'Quercus Texana', 'Quercus Trojana', 'Quercus Variabilis', 'Quercus Vulcanica', 'Quercus x Hispanica', 'Quercus x Turneri', 'Rhododendron x Russellianum', 'Salix Fragilis', 'Salix Intergra', 'Sorbus Aria', 'Tilia Oliveri', 'Tilia Platyphyllos', 'Tilia Tomentosa', 'Ulmus 
Bergmanniana', 'Viburnum Tinus', 'Viburnum x Rhytidophylloides', 'Zelkova Serrata' 
## 
## Pre-processing: centered (64), scaled (64), principal component
##  signal extraction (64) 
## Resampling: Cross-Validated (10 fold, repeated 5 times) 
## Summary of sample sizes: 1080, 1080, 1080, 1082, 1079, 1081, ... 
## Resampling results:
## 
##   Accuracy   Kappa    
##   0.7968584  0.7946337
result$confusionMatrix$overall[1:4]
##      Accuracy         Kappa AccuracyLower AccuracyUpper 
##     0.8075000     0.8055556     0.7653905     0.8449891
margin_data_accuracy <- result$confusionMatrix$overall[1]
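
The AccuracyLower and AccuracyUpper values printed above are the bounds of a 95% confidence interval for the test-set accuracy, which caret's confusionMatrix derives from an exact binomial test. A minimal sketch of that calculation follows. It assumes a held-out set of 400 leaves (four per species), which is not stated in the post but is consistent with an accuracy of 0.8075 being 323 correct predictions out of 400.

```r
# Assumed test-set size (hypothetical, inferred from the printed accuracy)
n_test <- 400
n_correct <- round(0.8075 * n_test)

# Exact (Clopper-Pearson) 95% confidence interval for the accuracy
ci <- binom.test(n_correct, n_test, conf.level = 0.95)$conf.int
accuracy_ci <- c(lower = ci[1], upper = ci[2])
print(accuracy_ci)
```

Under that assumption the interval should be close to the one reported by confusionMatrix. Its width of roughly eight percentage points is a reminder that differences of a point or two between models are within the noise of a single split.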

Shape Data

result <- plant_leaf_model(shape_data)
result$model
## Linear Discriminant Analysis 
## 
## 1199 samples
##   64 predictor
##  100 classes: 'Acer Campestre', 'Acer Capillipes', 'Acer Circinatum', 'Acer Mono', 'Acer Opalus', 'Acer Palmatum', 'Acer Pictum', 'Acer Platanoids', 'Acer Rubrum', 'Acer Rufinerve', 'Acer Saccharinum', 'Alnus Cordata', 'Alnus Maximowiczii', 'Alnus Rubra', 'Alnus Sieboldiana', 'Alnus Viridis', 'Arundinaria Simonii', 'Betula Austrosinensis', 'Betula Pendula', 'Callicarpa Bodinieri', 'Castanea Sativa', 'Celtis Koraiensis', 'Cercis Siliquastrum', 'Cornus Chinensis', 'Cornus Controversa', 'Cornus Macrophylla', 'Cotinus Coggygria', 'Crataegus Monogyna', 'Cytisus Battandieri', 'Eucalyptus Glaucescens', 'Eucalyptus Neglecta', 'Eucalyptus Urnigera', 'Fagus Sylvatica', 'Ginkgo Biloba', 'Ilex Aquifolium', 'Ilex Cornuta', 'Liquidambar Styraciflua', 'Liriodendron Tulipifera', 'Lithocarpus Cleistocarpus', 'Lithocarpus Edulis', 'Magnolia Heptapeta', 'Magnolia Salicifolia', 'Morus Nigra', 'Olea Europaea', 'Phildelphus', 'Populus Adenopoda', 'Populus Grandidentata', 'Populus Nigra', 'Prunus Avium', 'Prunus X Shmittii', 'Pterocarya Stenoptera', 'Quercus Afares', 'Quercus Agrifolia', 'Quercus Alnifolia', 'Quercus Brantii', 'Quercus Canariensis', 'Quercus Castaneifolia', 'Quercus Cerris', 'Quercus Chrysolepis', 'Quercus Coccifera', 'Quercus Coccinea', 'Quercus Crassifolia', 'Quercus Crassipes', 'Quercus Dolicholepis', 'Quercus Ellipsoidalis', 'Quercus Greggii', 'Quercus Hartwissiana', 'Quercus Ilex', 'Quercus Imbricaria', 'Quercus Infectoria sub', 'Quercus Kewensis', 'Quercus Nigra', 'Quercus Palustris', 'Quercus Phellos', 'Quercus Phillyraeoides', 'Quercus Pontica', 'Quercus Pubescens', 'Quercus Pyrenaica', 'Quercus Rhysophylla', 'Quercus Rubra', 'Quercus Semecarpifolia', 'Quercus Shumardii', 'Quercus Suber', 'Quercus Texana', 'Quercus Trojana', 'Quercus Variabilis', 'Quercus Vulcanica', 'Quercus x Hispanica', 'Quercus x Turneri', 'Rhododendron x Russellianum', 'Salix Fragilis', 'Salix Intergra', 'Sorbus Aria', 'Tilia Oliveri', 'Tilia Platyphyllos', 'Tilia Tomentosa', 'Ulmus 
Bergmanniana', 'Viburnum Tinus', 'Viburnum x Rhytidophylloides', 'Zelkova Serrata' 
## 
## Pre-processing: Box-Cox transformation (64), centered (64), scaled
##  (64), principal component signal extraction (64) 
## Resampling: Cross-Validated (10 fold, repeated 5 times) 
## Summary of sample sizes: 1081, 1074, 1086, 1071, 1077, 1081, ... 
## Resampling results:
## 
##   Accuracy  Kappa    
##   0.52309   0.5181073
result$confusionMatrix$overall[1:4]
##      Accuracy         Kappa AccuracyLower AccuracyUpper 
##     0.5350000     0.5303030     0.4847666     0.5847119
shape_data_accuracy <- result$confusionMatrix$overall[1]

Texture Data

result <- plant_leaf_model(texture_data)
result$model
## Linear Discriminant Analysis 
## 
## 1199 samples
##   64 predictor
##  100 classes: 'Acer Campestre', 'Acer Capillipes', 'Acer Circinatum', 'Acer Mono', 'Acer Opalus', 'Acer Palmatum', 'Acer Pictum', 'Acer Platanoids', 'Acer Rubrum', 'Acer Rufinerve', 'Acer Saccharinum', 'Alnus Cordata', 'Alnus Maximowiczii', 'Alnus Rubra', 'Alnus Sieboldiana', 'Alnus Viridis', 'Arundinaria Simonii', 'Betula Austrosinensis', 'Betula Pendula', 'Callicarpa Bodinieri', 'Castanea Sativa', 'Celtis Koraiensis', 'Cercis Siliquastrum', 'Cornus Chinensis', 'Cornus Controversa', 'Cornus Macrophylla', 'Cotinus Coggygria', 'Crataegus Monogyna', 'Cytisus Battandieri', 'Eucalyptus Glaucescens', 'Eucalyptus Neglecta', 'Eucalyptus Urnigera', 'Fagus Sylvatica', 'Ginkgo Biloba', 'Ilex Aquifolium', 'Ilex Cornuta', 'Liquidambar Styraciflua', 'Liriodendron Tulipifera', 'Lithocarpus Cleistocarpus', 'Lithocarpus Edulis', 'Magnolia Heptapeta', 'Magnolia Salicifolia', 'Morus Nigra', 'Olea Europaea', 'Phildelphus', 'Populus Adenopoda', 'Populus Grandidentata', 'Populus Nigra', 'Prunus Avium', 'Prunus X Shmittii', 'Pterocarya Stenoptera', 'Quercus Afares', 'Quercus Agrifolia', 'Quercus Alnifolia', 'Quercus Brantii', 'Quercus Canariensis', 'Quercus Castaneifolia', 'Quercus Cerris', 'Quercus Chrysolepis', 'Quercus Coccifera', 'Quercus Coccinea', 'Quercus Crassifolia', 'Quercus Crassipes', 'Quercus Dolicholepis', 'Quercus Ellipsoidalis', 'Quercus Greggii', 'Quercus Hartwissiana', 'Quercus Ilex', 'Quercus Imbricaria', 'Quercus Infectoria sub', 'Quercus Kewensis', 'Quercus Nigra', 'Quercus Palustris', 'Quercus Phellos', 'Quercus Phillyraeoides', 'Quercus Pontica', 'Quercus Pubescens', 'Quercus Pyrenaica', 'Quercus Rhysophylla', 'Quercus Rubra', 'Quercus Semecarpifolia', 'Quercus Shumardii', 'Quercus Suber', 'Quercus Texana', 'Quercus Trojana', 'Quercus Variabilis', 'Quercus Vulcanica', 'Quercus x Hispanica', 'Quercus x Turneri', 'Rhododendron x Russellianum', 'Salix Fragilis', 'Salix Intergra', 'Sorbus Aria', 'Tilia Oliveri', 'Tilia Platyphyllos', 'Tilia Tomentosa', 'Ulmus 
Bergmanniana', 'Viburnum Tinus', 'Viburnum x Rhytidophylloides', 'Zelkova Serrata' 
## 
## Pre-processing: centered (64), scaled (64), principal component
##  signal extraction (64) 
## Resampling: Cross-Validated (10 fold, repeated 5 times) 
## Summary of sample sizes: 1085, 1076, 1083, 1076, 1077, 1083, ... 
## Resampling results:
## 
##   Accuracy   Kappa    
##   0.7512698  0.7485512
result$confusionMatrix$overall[1:4]
##      Accuracy         Kappa AccuracyLower AccuracyUpper 
##     0.7600000     0.7575758     0.7150596     0.8010455
texture_data_accuracy <- result$confusionMatrix$overall[1]

Margin+Shape Data

margin_shape_data <- left_join(margin_data, shape_data_2, by='id')
result <- plant_leaf_model(margin_shape_data)
result$model
## Linear Discriminant Analysis 
## 
## 1199 samples
##  128 predictor
##  100 classes: 'Acer Campestre', 'Acer Capillipes', 'Acer Circinatum', 'Acer Mono', 'Acer Opalus', 'Acer Palmatum', 'Acer Pictum', 'Acer Platanoids', 'Acer Rubrum', 'Acer Rufinerve', 'Acer Saccharinum', 'Alnus Cordata', 'Alnus Maximowiczii', 'Alnus Rubra', 'Alnus Sieboldiana', 'Alnus Viridis', 'Arundinaria Simonii', 'Betula Austrosinensis', 'Betula Pendula', 'Callicarpa Bodinieri', 'Castanea Sativa', 'Celtis Koraiensis', 'Cercis Siliquastrum', 'Cornus Chinensis', 'Cornus Controversa', 'Cornus Macrophylla', 'Cotinus Coggygria', 'Crataegus Monogyna', 'Cytisus Battandieri', 'Eucalyptus Glaucescens', 'Eucalyptus Neglecta', 'Eucalyptus Urnigera', 'Fagus Sylvatica', 'Ginkgo Biloba', 'Ilex Aquifolium', 'Ilex Cornuta', 'Liquidambar Styraciflua', 'Liriodendron Tulipifera', 'Lithocarpus Cleistocarpus', 'Lithocarpus Edulis', 'Magnolia Heptapeta', 'Magnolia Salicifolia', 'Morus Nigra', 'Olea Europaea', 'Phildelphus', 'Populus Adenopoda', 'Populus Grandidentata', 'Populus Nigra', 'Prunus Avium', 'Prunus X Shmittii', 'Pterocarya Stenoptera', 'Quercus Afares', 'Quercus Agrifolia', 'Quercus Alnifolia', 'Quercus Brantii', 'Quercus Canariensis', 'Quercus Castaneifolia', 'Quercus Cerris', 'Quercus Chrysolepis', 'Quercus Coccifera', 'Quercus Coccinea', 'Quercus Crassifolia', 'Quercus Crassipes', 'Quercus Dolicholepis', 'Quercus Ellipsoidalis', 'Quercus Greggii', 'Quercus Hartwissiana', 'Quercus Ilex', 'Quercus Imbricaria', 'Quercus Infectoria sub', 'Quercus Kewensis', 'Quercus Nigra', 'Quercus Palustris', 'Quercus Phellos', 'Quercus Phillyraeoides', 'Quercus Pontica', 'Quercus Pubescens', 'Quercus Pyrenaica', 'Quercus Rhysophylla', 'Quercus Rubra', 'Quercus Semecarpifolia', 'Quercus Shumardii', 'Quercus Suber', 'Quercus Texana', 'Quercus Trojana', 'Quercus Variabilis', 'Quercus Vulcanica', 'Quercus x Hispanica', 'Quercus x Turneri', 'Rhododendron x Russellianum', 'Salix Fragilis', 'Salix Intergra', 'Sorbus Aria', 'Tilia Oliveri', 'Tilia Platyphyllos', 'Tilia Tomentosa', 'Ulmus 
Bergmanniana', 'Viburnum Tinus', 'Viburnum x Rhytidophylloides', 'Zelkova Serrata' 
## 
## Pre-processing: Box-Cox transformation (64), centered (128), scaled
##  (128), principal component signal extraction (128) 
## Resampling: Cross-Validated (10 fold, repeated 5 times) 
## Summary of sample sizes: 1083, 1075, 1071, 1080, 1076, 1082, ... 
## Resampling results:
## 
##   Accuracy  Kappa    
##   0.945658  0.9450532
result$confusionMatrix$overall[1:4]
##      Accuracy         Kappa AccuracyLower AccuracyUpper 
##     0.9450000     0.9444444     0.9179108     0.9652145
margin_shape_data_accuracy <- result$confusionMatrix$overall[1]
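
The left_join call above combines two feature tables row by row on the id column. The sketch below shows the idea on toy data. The column names margin1 and shape1 and the values are made up, and it assumes that shape_data_2 (loaded from the saved environment, not shown in this post) is the shape table without its species column, so that the label appears only once after the join.

```r
suppressPackageStartupMessages(library(dplyr))

# Toy margin table: identifier, label and one margin feature (hypothetical values)
margin_toy <- data.frame(
  id = 1:3,
  species = factor(c("Acer Mono", "Acer Mono", "Ginkgo Biloba")),
  margin1 = c(0.10, 0.12, 0.40)
)

# Toy shape table without the species column (assumed layout of shape_data_2)
shape_toy <- data.frame(
  id = 1:3,
  shape1 = c(0.7, 0.8, 0.2)
)

# One row per id, with the margin and shape features side by side
joined <- left_join(margin_toy, shape_toy, by = "id")
print(joined)
```

If both tables carried a species column, the join would produce species.x and species.y, and the formula built inside plant_leaf_model would no longer find a column named species.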

Margin+Texture Data

margin_texture_data <- left_join(margin_data, texture_data_2, by='id')
result <- plant_leaf_model(margin_texture_data)
result$model
## Linear Discriminant Analysis 
## 
## 1199 samples
##  128 predictor
##  100 classes: 'Acer Campestre', 'Acer Capillipes', 'Acer Circinatum', 'Acer Mono', 'Acer Opalus', 'Acer Palmatum', 'Acer Pictum', 'Acer Platanoids', 'Acer Rubrum', 'Acer Rufinerve', 'Acer Saccharinum', 'Alnus Cordata', 'Alnus Maximowiczii', 'Alnus Rubra', 'Alnus Sieboldiana', 'Alnus Viridis', 'Arundinaria Simonii', 'Betula Austrosinensis', 'Betula Pendula', 'Callicarpa Bodinieri', 'Castanea Sativa', 'Celtis Koraiensis', 'Cercis Siliquastrum', 'Cornus Chinensis', 'Cornus Controversa', 'Cornus Macrophylla', 'Cotinus Coggygria', 'Crataegus Monogyna', 'Cytisus Battandieri', 'Eucalyptus Glaucescens', 'Eucalyptus Neglecta', 'Eucalyptus Urnigera', 'Fagus Sylvatica', 'Ginkgo Biloba', 'Ilex Aquifolium', 'Ilex Cornuta', 'Liquidambar Styraciflua', 'Liriodendron Tulipifera', 'Lithocarpus Cleistocarpus', 'Lithocarpus Edulis', 'Magnolia Heptapeta', 'Magnolia Salicifolia', 'Morus Nigra', 'Olea Europaea', 'Phildelphus', 'Populus Adenopoda', 'Populus Grandidentata', 'Populus Nigra', 'Prunus Avium', 'Prunus X Shmittii', 'Pterocarya Stenoptera', 'Quercus Afares', 'Quercus Agrifolia', 'Quercus Alnifolia', 'Quercus Brantii', 'Quercus Canariensis', 'Quercus Castaneifolia', 'Quercus Cerris', 'Quercus Chrysolepis', 'Quercus Coccifera', 'Quercus Coccinea', 'Quercus Crassifolia', 'Quercus Crassipes', 'Quercus Dolicholepis', 'Quercus Ellipsoidalis', 'Quercus Greggii', 'Quercus Hartwissiana', 'Quercus Ilex', 'Quercus Imbricaria', 'Quercus Infectoria sub', 'Quercus Kewensis', 'Quercus Nigra', 'Quercus Palustris', 'Quercus Phellos', 'Quercus Phillyraeoides', 'Quercus Pontica', 'Quercus Pubescens', 'Quercus Pyrenaica', 'Quercus Rhysophylla', 'Quercus Rubra', 'Quercus Semecarpifolia', 'Quercus Shumardii', 'Quercus Suber', 'Quercus Texana', 'Quercus Trojana', 'Quercus Variabilis', 'Quercus Vulcanica', 'Quercus x Hispanica', 'Quercus x Turneri', 'Rhododendron x Russellianum', 'Salix Fragilis', 'Salix Intergra', 'Sorbus Aria', 'Tilia Oliveri', 'Tilia Platyphyllos', 'Tilia Tomentosa', 'Ulmus 
Bergmanniana', 'Viburnum Tinus', 'Viburnum x Rhytidophylloides', 'Zelkova Serrata' 
## 
## Pre-processing: centered (128), scaled (128), principal component
##  signal extraction (128) 
## Resampling: Cross-Validated (10 fold, repeated 5 times) 
## Summary of sample sizes: 1083, 1083, 1074, 1084, 1077, 1084, ... 
## Resampling results:
## 
##   Accuracy   Kappa    
##   0.9598748  0.9594263
result$confusionMatrix$overall[1:4]
##      Accuracy         Kappa AccuracyLower AccuracyUpper 
##     0.9750000     0.9747475     0.9545057     0.9879479
margin_texture_data_accuracy <- result$confusionMatrix$overall[1]

Shape+Texture Data

shape_texture_data <- left_join(shape_data, texture_data_2, by='id')
result <- plant_leaf_model(shape_texture_data)
result$model
## Linear Discriminant Analysis 
## 
## 1199 samples
##  128 predictor
##  100 classes: 'Acer Campestre', 'Acer Capillipes', 'Acer Circinatum', 'Acer Mono', 'Acer Opalus', 'Acer Palmatum', 'Acer Pictum', 'Acer Platanoids', 'Acer Rubrum', 'Acer Rufinerve', 'Acer Saccharinum', 'Alnus Cordata', 'Alnus Maximowiczii', 'Alnus Rubra', 'Alnus Sieboldiana', 'Alnus Viridis', 'Arundinaria Simonii', 'Betula Austrosinensis', 'Betula Pendula', 'Callicarpa Bodinieri', 'Castanea Sativa', 'Celtis Koraiensis', 'Cercis Siliquastrum', 'Cornus Chinensis', 'Cornus Controversa', 'Cornus Macrophylla', 'Cotinus Coggygria', 'Crataegus Monogyna', 'Cytisus Battandieri', 'Eucalyptus Glaucescens', 'Eucalyptus Neglecta', 'Eucalyptus Urnigera', 'Fagus Sylvatica', 'Ginkgo Biloba', 'Ilex Aquifolium', 'Ilex Cornuta', 'Liquidambar Styraciflua', 'Liriodendron Tulipifera', 'Lithocarpus Cleistocarpus', 'Lithocarpus Edulis', 'Magnolia Heptapeta', 'Magnolia Salicifolia', 'Morus Nigra', 'Olea Europaea', 'Phildelphus', 'Populus Adenopoda', 'Populus Grandidentata', 'Populus Nigra', 'Prunus Avium', 'Prunus X Shmittii', 'Pterocarya Stenoptera', 'Quercus Afares', 'Quercus Agrifolia', 'Quercus Alnifolia', 'Quercus Brantii', 'Quercus Canariensis', 'Quercus Castaneifolia', 'Quercus Cerris', 'Quercus Chrysolepis', 'Quercus Coccifera', 'Quercus Coccinea', 'Quercus Crassifolia', 'Quercus Crassipes', 'Quercus Dolicholepis', 'Quercus Ellipsoidalis', 'Quercus Greggii', 'Quercus Hartwissiana', 'Quercus Ilex', 'Quercus Imbricaria', 'Quercus Infectoria sub', 'Quercus Kewensis', 'Quercus Nigra', 'Quercus Palustris', 'Quercus Phellos', 'Quercus Phillyraeoides', 'Quercus Pontica', 'Quercus Pubescens', 'Quercus Pyrenaica', 'Quercus Rhysophylla', 'Quercus Rubra', 'Quercus Semecarpifolia', 'Quercus Shumardii', 'Quercus Suber', 'Quercus Texana', 'Quercus Trojana', 'Quercus Variabilis', 'Quercus Vulcanica', 'Quercus x Hispanica', 'Quercus x Turneri', 'Rhododendron x Russellianum', 'Salix Fragilis', 'Salix Intergra', 'Sorbus Aria', 'Tilia Oliveri', 'Tilia Platyphyllos', 'Tilia Tomentosa', 'Ulmus 
Bergmanniana', 'Viburnum Tinus', 'Viburnum x Rhytidophylloides', 'Zelkova Serrata' 
## 
## Pre-processing: Box-Cox transformation (64), centered (128), scaled
##  (128), principal component signal extraction (128) 
## Resampling: Cross-Validated (10 fold, repeated 5 times) 
## Summary of sample sizes: 1081, 1078, 1075, 1083, 1078, 1080, ... 
## Resampling results:
## 
##   Accuracy   Kappa    
##   0.9096225  0.9086215
result$confusionMatrix$overall[1:4]
##      Accuracy         Kappa AccuracyLower AccuracyUpper 
##     0.9100000     0.9090909     0.8775801     0.9361685
shape_texture_data_accuracy <- result$confusionMatrix$overall[1]

Margin+Shape+Texture Data

result <- plant_leaf_model(leaves_data)
result$model
## Linear Discriminant Analysis 
## 
## 1199 samples
##  192 predictor
##  100 classes: 'Acer Campestre', 'Acer Capillipes', 'Acer Circinatum', 'Acer Mono', 'Acer Opalus', 'Acer Palmatum', 'Acer Pictum', 'Acer Platanoids', 'Acer Rubrum', 'Acer Rufinerve', 'Acer Saccharinum', 'Alnus Cordata', 'Alnus Maximowiczii', 'Alnus Rubra', 'Alnus Sieboldiana', 'Alnus Viridis', 'Arundinaria Simonii', 'Betula Austrosinensis', 'Betula Pendula', 'Callicarpa Bodinieri', 'Castanea Sativa', 'Celtis Koraiensis', 'Cercis Siliquastrum', 'Cornus Chinensis', 'Cornus Controversa', 'Cornus Macrophylla', 'Cotinus Coggygria', 'Crataegus Monogyna', 'Cytisus Battandieri', 'Eucalyptus Glaucescens', 'Eucalyptus Neglecta', 'Eucalyptus Urnigera', 'Fagus Sylvatica', 'Ginkgo Biloba', 'Ilex Aquifolium', 'Ilex Cornuta', 'Liquidambar Styraciflua', 'Liriodendron Tulipifera', 'Lithocarpus Cleistocarpus', 'Lithocarpus Edulis', 'Magnolia Heptapeta', 'Magnolia Salicifolia', 'Morus Nigra', 'Olea Europaea', 'Phildelphus', 'Populus Adenopoda', 'Populus Grandidentata', 'Populus Nigra', 'Prunus Avium', 'Prunus X Shmittii', 'Pterocarya Stenoptera', 'Quercus Afares', 'Quercus Agrifolia', 'Quercus Alnifolia', 'Quercus Brantii', 'Quercus Canariensis', 'Quercus Castaneifolia', 'Quercus Cerris', 'Quercus Chrysolepis', 'Quercus Coccifera', 'Quercus Coccinea', 'Quercus Crassifolia', 'Quercus Crassipes', 'Quercus Dolicholepis', 'Quercus Ellipsoidalis', 'Quercus Greggii', 'Quercus Hartwissiana', 'Quercus Ilex', 'Quercus Imbricaria', 'Quercus Infectoria sub', 'Quercus Kewensis', 'Quercus Nigra', 'Quercus Palustris', 'Quercus Phellos', 'Quercus Phillyraeoides', 'Quercus Pontica', 'Quercus Pubescens', 'Quercus Pyrenaica', 'Quercus Rhysophylla', 'Quercus Rubra', 'Quercus Semecarpifolia', 'Quercus Shumardii', 'Quercus Suber', 'Quercus Texana', 'Quercus Trojana', 'Quercus Variabilis', 'Quercus Vulcanica', 'Quercus x Hispanica', 'Quercus x Turneri', 'Rhododendron x Russellianum', 'Salix Fragilis', 'Salix Intergra', 'Sorbus Aria', 'Tilia Oliveri', 'Tilia Platyphyllos', 'Tilia Tomentosa', 'Ulmus 
Bergmanniana', 'Viburnum Tinus', 'Viburnum x Rhytidophylloides', 'Zelkova Serrata' 
## 
## Pre-processing: Box-Cox transformation (64), centered (192), scaled
##  (192), principal component signal extraction (192) 
## Resampling: Cross-Validated (10 fold, repeated 5 times) 
## Summary of sample sizes: 1082, 1074, 1082, 1086, 1076, 1078, ... 
## Resampling results:
## 
##   Accuracy   Kappa    
##   0.9852904  0.9851256
result$confusionMatrix$overall[1:4]
##      Accuracy         Kappa AccuracyLower AccuracyUpper 
##     0.9850000     0.9848485     0.9676387     0.9944759
leaf_data_accuracy <- result$confusionMatrix$overall[1]

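We can now summarize the results achieved by Linear Discriminant Analysis alongside those of ref. [1]. Accuracies are percentages, and a V marks the feature sets in use (SHA = shape, TEX = texture, MAR = margin).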

##   SHA TEX MAR prop_KNN_accuracy wprop_KNN__accuracy LDA_accuracy
## 1   V                     62.13               61.88       53.50
## 2       V                 72.94               72.75       76.00
## 3           V             75.00               75.75       80.75
## 4   V   V                 86.19               86.06       91.00
## 5   V       V             87.19               86.75       94.50
## 6       V   V             93.38               93.31       97.50
## 7   V   V   V             96.81               96.69       98.50
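
To read the table more easily, the figures can be put in a data frame and the LDA accuracy compared with the better of the two KNN variants. This is a small sketch that only reuses the numbers printed above; the column name LDA_gain and the variable lda_wins are introduced here for illustration.

```r
# Accuracies (in %) copied from the table above
comparison <- data.frame(
  features  = c("SHA", "TEX", "MAR", "SHA+TEX", "SHA+MAR", "TEX+MAR", "SHA+TEX+MAR"),
  prop_KNN  = c(62.13, 72.94, 75.00, 86.19, 87.19, 93.38, 96.81),
  wprop_KNN = c(61.88, 72.75, 75.75, 86.06, 86.75, 93.31, 96.69),
  LDA       = c(53.50, 76.00, 80.75, 91.00, 94.50, 97.50, 98.50),
  stringsAsFactors = FALSE
)

# Percentage points gained by LDA over the best KNN variant for each row
comparison$LDA_gain <- comparison$LDA - pmax(comparison$prop_KNN, comparison$wprop_KNN)

# Feature combinations where LDA comes out ahead
lda_wins <- comparison$features[comparison$LDA_gain > 0]
print(comparison[, c("features", "LDA_gain")])
```

The gain is negative only for the shape-only row and positive for the other six combinations.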

Conclusions

With the sole exception of the shape-only scenario, Linear Discriminant Analysis achieves higher accuracy than the K-Nearest-Neighbor based models of ref. [1] for every combination of feature sets.

References

[1] Charles Mallah, James Cope and James Orwell, "Plant Leaf Classification Using Probabilistic Integration of Shape, Texture and Margin Features", https://www.researchgate.net/publication/266632357_Plant_Leaf_Classification_using_Probabilistic_Integration_of_Shape_Texture_and_Margin_Features

[2] Leaf Dataset, https://archive.ics.uci.edu/ml/datasets/One-hundred+plant+species+leaves+data+set
