Tuesday, October 2, 2018

Plant Leaf Classification - Part 1

Abstract

In this post, I analyze the plant leaf dataset made available by the UCI Machine Learning Repository at:

https://archive.ics.uci.edu/ml/datasets/One-hundred+plant+species+leaves+data+set#

Exploratory analysis and models for plant leaf classification will be introduced.

Introduction

The dataset comprises sixteen samples for each of one hundred plant species. Its analysis was introduced in ref. [1]. That paper describes a method designed to work with small training sets and possibly incomplete feature extraction. This motivated separate processing of three feature types:

  • shape
  • texture
  • margin

Those are then combined to provide an overall indication of the species (and associated probability). For an accurate description of those features, please see ref. [1], where the classification is implemented by a K-Nearest-Neighbor density estimator. The authors report the accuracy reached by K-Nearest-Neighbor classification for every combination of the three datasets (see ref. [1], table 2). The top accuracy is 96%, achieved when using all three datasets.
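
To make the idea of "one probability estimate per feature type, then combined" more concrete, here is a minimal sketch on toy data. It is not the estimator from ref. [1]: the helper name knn_posterior(), the add-one smoothing and the product rule used to combine the estimates are all assumptions made for illustration only.

```r
# Toy illustration only: NOT the estimator described in ref. [1].
# knn_posterior() is a hypothetical helper name.
knn_posterior <- function(train_x, train_y, query, k = 3) {
  # Euclidean distance from the query to every training row
  d <- sqrt(rowSums(sweep(train_x, 2, query)^2))
  nearest <- train_y[order(d)[seq_len(k)]]
  counts <- table(factor(nearest, levels = levels(train_y)))
  # Add-one smoothing (an assumption) so that no class gets probability zero
  (as.numeric(counts) + 1) / (k + nlevels(train_y))
}

set.seed(1)
toy_species <- factor(rep(c("A", "B"), each = 5))
# Two made-up feature types with two features each
toy_shape   <- rbind(matrix(rnorm(10, 0), 5), matrix(rnorm(10, 5), 5))
toy_texture <- rbind(matrix(rnorm(10, 0), 5), matrix(rnorm(10, 5), 5))

p_shape   <- knn_posterior(toy_shape, toy_species, c(5, 5))
p_texture <- knn_posterior(toy_texture, toy_species, c(5, 5))

# Combine the per-feature-type estimates with a normalized product (an assumption)
combined <- p_shape * p_texture
combined <- combined / sum(combined)
names(combined) <- levels(toy_species)
predicted <- names(which.max(combined))
```

The query point lies close to the toy class "B" for both feature types, so the combined estimate favors "B". Other combination rules (for example a sum or a weighted product) are equally possible.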

In this post, I introduce an exploratory analysis of those datasets. In the next post, classification models will be outlined.

Packages

suppressPackageStartupMessages(library(caret))
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(corrplot))

Exploratory Analysis

First, we download the zip file providing all plant leaf datasets.

url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/00241/100%20leaves%20plant%20species.zip"
temp_file <- tempfile()
download.file(url, temp_file, mode = "wb")

Actually, the zip file provides further content, such as a readme file, a PDF document and a folder tree with a set of images for each plant species. However, the files we are specifically interested in are:

margin_file <- "100 leaves plant species/data_Mar_64.txt"
shape_file <- "100 leaves plant species/data_Sha_64.txt"
texture_file <- "100 leaves plant species/data_Tex_64.txt"

We unzip the three files of interest into the current working directory.

files_to_unzip <- c(margin_file, shape_file, texture_file)
unzip(temp_file, files = files_to_unzip, exdir=".", overwrite = TRUE)
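
Before reading the files, it can be worth checking that the extraction produced them. The sketch below uses a hypothetical helper, assert_files_exist(), which is not part of the original analysis. It is demonstrated on a throwaway temporary file so that it runs without the real archive.

```r
# assert_files_exist() is a hypothetical helper, not part of the original post.
assert_files_exist <- function(paths) {
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("Missing files: ", paste(missing, collapse = ", "))
  }
  invisible(TRUE)
}

# Demonstration on a throwaway file, so the sketch runs without the zip archive
demo_file <- tempfile(fileext = ".txt")
file.create(demo_file)
ok <- assert_files_exist(demo_file)

# A path that does not exist triggers an error, captured here with try()
bad_path <- file.path(tempdir(), "no_such_file.txt")
failed <- inherits(try(assert_files_exist(bad_path), silent = TRUE), "try-error")
```

With the real data, the call would be assert_files_exist(files_to_unzip).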

We then read the datasets of interest.

margin_data <- read.csv(margin_file, header=FALSE, sep=",", stringsAsFactors = TRUE)
head(margin_data)
##               V1       V2       V3       V4       V5       V6       V7
## 1 Acer Campestre 0.003906 0.003906 0.027344 0.033203 0.007812 0.017578
## 2 Acer Campestre 0.005859 0.013672 0.027344 0.025391 0.013672 0.029297
## 3 Acer Campestre 0.011719 0.001953 0.027344 0.044922 0.017578 0.042969
## 4 Acer Campestre 0.013672 0.011719 0.037109 0.017578 0.011719 0.087891
## 5 Acer Campestre 0.007812 0.009766 0.027344 0.025391 0.001953 0.005859
## 6 Acer Campestre 0.015625 0.003906 0.015625 0.046875 0.013672 0.064453
##         V8       V9      V10      V11      V12      V13      V14      V15
## 1 0.023438 0.005859 0.000000 0.015625 0.015625 0.015625 0.025391 0.000000
## 2 0.019531 0.000000 0.001953 0.021484 0.007812 0.003906 0.013672 0.003906
## 3 0.023438 0.000000 0.003906 0.019531 0.017578 0.005859 0.009766 0.005859
## 4 0.023438 0.000000 0.000000 0.027344 0.021484 0.009766 0.013672 0.001953
## 5 0.015625 0.000000 0.005859 0.017578 0.039062 0.007812 0.042969 0.001953
## 6 0.017578 0.003906 0.000000 0.033203 0.013672 0.009766 0.023438 0.000000
##        V16 V17      V18      V19      V20      V21      V22      V23 V24
## 1 0.015625   0 0.025391 0.027344 0.033203 0.009766 0.025391 0.001953   0
## 2 0.009766   0 0.044922 0.011719 0.007812 0.031250 0.017578 0.001953   0
## 3 0.015625   0 0.027344 0.017578 0.019531 0.019531 0.023438 0.001953   0
## 4 0.013672   0 0.021484 0.017578 0.011719 0.027344 0.009766 0.003906   0
## 5 0.007812   0 0.017578 0.027344 0.025391 0.003906 0.019531 0.003906   0
## 6 0.013672   0 0.031250 0.027344 0.017578 0.015625 0.011719 0.000000   0
##        V25      V26      V27      V28      V29      V30      V31      V32
## 1 0.003906 0.001953 0.044922 0.000000 0.031250 0.033203 0.015625 0.003906
## 2 0.001953 0.001953 0.037109 0.000000 0.029297 0.025391 0.039062 0.003906
## 3 0.011719 0.001953 0.042969 0.000000 0.011719 0.023438 0.035156 0.001953
## 4 0.011719 0.000000 0.050781 0.001953 0.019531 0.013672 0.011719 0.003906
## 5 0.009766 0.003906 0.017578 0.001953 0.009766 0.031250 0.015625 0.001953
## 6 0.007812 0.000000 0.027344 0.000000 0.015625 0.027344 0.031250 0.001953
##        V33      V34      V35      V36      V37      V38      V39      V40
## 1 0.000000 0.013672 0.001953 0.015625 0.015625 0.017578 0.033203 0.015625
## 2 0.000000 0.035156 0.000000 0.027344 0.015625 0.019531 0.029297 0.023438
## 3 0.001953 0.027344 0.000000 0.003906 0.017578 0.027344 0.033203 0.007812
## 4 0.000000 0.027344 0.000000 0.013672 0.007812 0.017578 0.035156 0.021484
## 5 0.001953 0.037109 0.000000 0.011719 0.037109 0.015625 0.039062 0.009766
## 6 0.001953 0.013672 0.003906 0.015625 0.027344 0.001953 0.021484 0.015625
##        V41      V42      V43      V44      V45      V46      V47      V48
## 1 0.003906 0.003906 0.031250 0.035156 0.015625 0.044922 0.007812 0.029297
## 2 0.005859 0.003906 0.017578 0.025391 0.011719 0.025391 0.005859 0.035156
## 3 0.001953 0.007812 0.035156 0.015625 0.011719 0.027344 0.003906 0.011719
## 4 0.000000 0.003906 0.035156 0.027344 0.015625 0.029297 0.007812 0.011719
## 5 0.001953 0.000000 0.050781 0.023438 0.005859 0.058594 0.003906 0.015625
## 6 0.003906 0.001953 0.025391 0.029297 0.013672 0.039062 0.009766 0.025391
##        V49      V50      V51      V52      V53      V54      V55      V56
## 1 0.037109 0.000000 0.027344 0.005859 0.001953 0.041016 0.000000 0.011719
## 2 0.033203 0.000000 0.015625 0.011719 0.003906 0.037109 0.001953 0.017578
## 3 0.019531 0.001953 0.013672 0.005859 0.007812 0.068359 0.003906 0.035156
## 4 0.023438 0.000000 0.017578 0.009766 0.007812 0.052734 0.003906 0.015625
## 5 0.021484 0.001953 0.015625 0.023438 0.005859 0.027344 0.000000 0.023438
## 6 0.031250 0.000000 0.019531 0.001953 0.000000 0.050781 0.003906 0.005859
##        V57      V58      V59      V60      V61      V62      V63      V64
## 1 0.000000 0.005859 0.035156 0.027344 0.033203 0.001953 0.000000 0.017578
## 2 0.000000 0.021484 0.017578 0.046875 0.005859 0.003906 0.003906 0.046875
## 3 0.000000 0.015625 0.021484 0.056641 0.009766 0.003906 0.000000 0.015625
## 4 0.001953 0.021484 0.029297 0.033203 0.003906 0.000000 0.001953 0.027344
## 5 0.001953 0.021484 0.048828 0.056641 0.019531 0.000000 0.000000 0.013672
## 6 0.001953 0.021484 0.017578 0.041016 0.017578 0.003906 0.001953 0.041016
##   V65
## 1   0
## 2   0
## 3   0
## 4   0
## 5   0
## 6   0
shape_data <- read.csv(shape_file, header=FALSE, sep=",", stringsAsFactors = TRUE)
head(shape_data)
##                V1         V2         V3         V4         V5         V6
## 1 Acer Capillipes 0.00057884 0.00060866 0.00055063 0.00055412 0.00060251
## 2 Acer Capillipes 0.00063028 0.00066074 0.00071871 0.00065149 0.00064287
## 3 Acer Capillipes 0.00061634 0.00061527 0.00060560 0.00056823 0.00055835
## 4 Acer Capillipes 0.00061271 0.00056928 0.00056431 0.00060722 0.00064275
## 5 Acer Capillipes 0.00059946 0.00055240 0.00055795 0.00056888 0.00061560
## 6 Acer Capillipes 0.00055321 0.00058276 0.00052074 0.00057495 0.00060107
##           V7         V8         V9        V10        V11        V12
## 1 0.00061420 0.00061099 0.00061126 0.00061127 0.00059401 0.00056723
## 2 0.00063999 0.00064591 0.00062378 0.00058375 0.00054570 0.00052020
## 3 0.00055227 0.00055119 0.00055162 0.00053110 0.00052965 0.00050656
## 4 0.00064712 0.00066345 0.00065759 0.00063546 0.00060048 0.00058617
## 5 0.00063898 0.00063149 0.00063393 0.00063895 0.00059634 0.00058721
## 6 0.00060068 0.00060566 0.00059552 0.00059341 0.00056913 0.00054535
##          V13        V14        V15        V16        V17        V18
## 1 0.00054544 0.00054852 0.00052220 0.00051996 0.00050029 0.00049550
## 2 0.00049217 0.00047365 0.00045806 0.00044512 0.00041131 0.00041642
## 3 0.00050098 0.00048780 0.00042640 0.00044484 0.00047091 0.00045628
## 4 0.00056935 0.00054666 0.00051782 0.00050056 0.00048341 0.00046331
## 5 0.00057578 0.00056669 0.00053864 0.00052719 0.00051216 0.00050576
## 6 0.00054417 0.00052943 0.00052480 0.00050725 0.00049601 0.00048730
##          V19        V20        V21        V22        V23        V24
## 1 0.00048506 0.00043270 0.00043344 0.00044133 0.00045759 0.00047369
## 2 0.00041792 0.00043338 0.00046667 0.00049444 0.00051226 0.00051463
## 3 0.00047391 0.00050035 0.00053424 0.00053726 0.00058021 0.00059195
## 4 0.00044204 0.00043805 0.00044337 0.00047125 0.00048651 0.00051293
## 5 0.00047713 0.00042037 0.00043828 0.00044307 0.00046131 0.00048500
## 6 0.00043755 0.00043060 0.00043600 0.00044354 0.00047238 0.00048704
##          V25        V26        V27        V28        V29        V30
## 1 0.00050027 0.00051038 0.00055370 0.00057409 0.00061545 0.00063420
## 2 0.00054706 0.00056805 0.00061162 0.00062314 0.00067340 0.00070644
## 3 0.00063824 0.00065655 0.00069176 0.00075707 0.00081739 0.00087267
## 4 0.00050608 0.00054728 0.00057562 0.00061750 0.00060790 0.00066313
## 5 0.00050022 0.00050812 0.00054919 0.00057575 0.00062919 0.00065269
## 6 0.00050947 0.00052215 0.00055134 0.00057878 0.00061888 0.00067277
##          V31        V32        V33        V34        V35        V36
## 1 0.00066996 0.00070086 0.00076520 0.00082210 0.00088670 0.00095273
## 2 0.00074835 0.00079712 0.00085795 0.00091375 0.00097812 0.00095934
## 3 0.00093889 0.00091932 0.00084868 0.00079460 0.00073694 0.00067678
## 4 0.00069860 0.00074795 0.00080933 0.00087363 0.00094199 0.00097837
## 5 0.00069252 0.00072327 0.00078248 0.00083414 0.00088673 0.00095142
## 6 0.00072001 0.00075503 0.00080892 0.00086298 0.00091485 0.00097837
##          V37        V38        V39        V40        V41        V42
## 1 0.00089004 0.00082684 0.00077746 0.00072624 0.00068827 0.00064738
## 2 0.00089228 0.00082245 0.00076643 0.00069780 0.00066379 0.00063080
## 3 0.00064251 0.00060182 0.00058021 0.00056083 0.00051736 0.00050721
## 4 0.00091476 0.00085575 0.00079947 0.00074441 0.00070522 0.00066982
## 5 0.00095801 0.00088576 0.00081584 0.00074559 0.00069396 0.00064167
## 6 0.00091097 0.00084917 0.00078448 0.00072748 0.00069029 0.00064487
##          V43        V44        V45        V46        V47        V48
## 1 0.00060111 0.00059136 0.00054355 0.00051874 0.00051434 0.00048473
## 2 0.00058542 0.00055536 0.00050568 0.00048712 0.00045827 0.00044403
## 3 0.00048000 0.00044296 0.00044281 0.00044033 0.00047195 0.00048522
## 4 0.00063146 0.00059858 0.00056835 0.00052424 0.00051716 0.00048934
## 5 0.00063091 0.00059208 0.00056242 0.00051641 0.00053070 0.00050384
## 6 0.00059307 0.00056657 0.00052382 0.00050103 0.00047318 0.00047589
##          V49        V50        V51        V52        V53        V54
## 1 0.00048040 0.00045491 0.00045303 0.00044640 0.00042522 0.00047467
## 2 0.00041724 0.00041694 0.00041162 0.00041296 0.00045280 0.00046123
## 3 0.00050598 0.00052326 0.00054469 0.00054883 0.00055364 0.00056865
## 4 0.00047944 0.00046101 0.00044595 0.00045062 0.00049919 0.00049343
## 5 0.00047250 0.00045785 0.00044651 0.00043924 0.00049357 0.00049833
## 6 0.00045528 0.00045104 0.00044349 0.00043897 0.00044078 0.00049003
##          V55        V56        V57        V58        V59        V60
## 1 0.00048917 0.00050727 0.00053275 0.00055509 0.00056515 0.00058112
## 2 0.00047762 0.00050331 0.00051962 0.00053258 0.00056427 0.00059566
## 3 0.00059373 0.00058093 0.00058984 0.00058882 0.00056550 0.00057454
## 4 0.00050075 0.00051610 0.00053550 0.00054904 0.00054225 0.00056617
## 5 0.00050411 0.00052932 0.00054267 0.00055699 0.00056183 0.00055837
## 6 0.00050851 0.00052437 0.00052753 0.00054212 0.00057232 0.00056827
##          V61        V62        V63        V64        V65
## 1 0.00059677 0.00062541 0.00062410 0.00061671 0.00061401
## 2 0.00062344 0.00064182 0.00066119 0.00067058 0.00066691
## 3 0.00061797 0.00054334 0.00059248 0.00060658 0.00060235
## 4 0.00059244 0.00060052 0.00060897 0.00061372 0.00060271
## 5 0.00059111 0.00060751 0.00061317 0.00061027 0.00059428
## 6 0.00061607 0.00062150 0.00062151 0.00062510 0.00060623
texture_data <- read.csv(texture_file, header=FALSE, sep=",", stringsAsFactors = TRUE)
head(texture_data)
##               V1       V2       V3       V4       V5       V6 V7       V8
## 1 Acer Campestre 0.025391 0.012695 0.003906 0.004883 0.039062  0 0.017578
## 2 Acer Campestre 0.004883 0.018555 0.002930 0.000000 0.069336  0 0.013672
## 3 Acer Campestre 0.018555 0.013672 0.002930 0.002930 0.051758  0 0.019531
## 4 Acer Campestre 0.035156 0.023438 0.000977 0.000000 0.061523  0 0.021484
## 5 Acer Campestre 0.038086 0.014648 0.003906 0.000977 0.046875  0 0.022461
## 6 Acer Campestre 0.045898 0.017578 0.000000 0.003906 0.044922  0 0.017578
##         V9      V10      V11      V12      V13 V14      V15 V16 V17
## 1 0.035156 0.023438 0.013672 0.000977 0.036133   0 0.000000   0   0
## 2 0.043945 0.026367 0.000000 0.000000 0.395510   0 0.000000   0   0
## 3 0.035156 0.022461 0.000977 0.000000 0.208980   0 0.000000   0   0
## 4 0.061523 0.010742 0.001953 0.000000 0.222660   0 0.000000   0   0
## 5 0.053711 0.019531 0.004883 0.000977 0.076172   0 0.000000   0   0
## 6 0.036133 0.006836 0.002930 0.000000 0.058594   0 0.000977   0   0
##        V18 V19      V20      V21 V22      V23      V24      V25     V26
## 1 0.024414   0 0.028320 0.021484   0 0.003906 0.007812 0.001953 0.00000
## 2 0.032227   0 0.010742 0.027344   0 0.000000 0.032227 0.000977 0.00000
## 3 0.042969   0 0.017578 0.021484   0 0.000000 0.035156 0.003906 0.00000
## 4 0.022461   0 0.008789 0.018555   0 0.000000 0.036133 0.000000 0.00000
## 5 0.019531   0 0.006836 0.035156   0 0.007812 0.033203 0.000000 0.00000
## 6 0.021484   0 0.011719 0.024414   0 0.000000 0.036133 0.003906 0.00293
##        V27      V28 V29      V30      V31      V32 V33 V34      V35
## 1 0.001953 0.044922   0 0.027344 0.000000 0.031250   0   0 0.069336
## 2 0.000000 0.000977   0 0.028320 0.000977 0.027344   0   0 0.018555
## 3 0.000000 0.014648   0 0.023438 0.000000 0.025391   0   0 0.062500
## 4 0.000000 0.000977   0 0.035156 0.000000 0.040039   0   0 0.078125
## 5 0.000000 0.015625   0 0.032227 0.000000 0.049805   0   0 0.073242
## 6 0.000000 0.042969   0 0.018555 0.000000 0.064453   0   0 0.100590
##        V36 V37      V38      V39      V40      V41      V42 V43      V44
## 1 0.000000   0 0.000977 0.027344 0.023438 0.012695 0.006836   0 0.011719
## 2 0.003906   0 0.000000 0.020508 0.000000 0.000977 0.000000   0 0.000977
## 3 0.002930   0 0.000000 0.027344 0.004883 0.007812 0.000000   0 0.012695
## 4 0.002930   0 0.000000 0.039062 0.001953 0.004883 0.000000   0 0.000977
## 5 0.000000   0 0.000000 0.019531 0.016602 0.006836 0.000000   0 0.012695
## 6 0.000977   0 0.000000 0.025391 0.011719 0.008789 0.002930   0 0.023438
##        V45      V46      V47      V48      V49 V50      V51 V52 V53 V54
## 1 0.162110 0.005859 0.022461 0.025391 0.013672   0 0.041016   0   0   0
## 2 0.003906 0.031250 0.014648 0.012695 0.037109   0 0.011719   0   0   0
## 3 0.020508 0.033203 0.016602 0.025391 0.021484   0 0.007812   0   0   0
## 4 0.004883 0.019531 0.025391 0.021484 0.019531   0 0.012695   0   0   0
## 5 0.058594 0.018555 0.029297 0.020508 0.013672   0 0.035156   0   0   0
## 6 0.082031 0.021484 0.031250 0.032227 0.016602   0 0.004883   0   0   0
##        V55      V56 V57      V58      V59      V60 V61 V62      V63 V64
## 1 0.012695 0.103520   0 0.001953 0.000977 0.022461   0   0 0.001953   0
## 2 0.011719 0.070312   0 0.017578 0.000000 0.004883   0   0 0.000000   0
## 3 0.022461 0.156250   0 0.008789 0.000000 0.001953   0   0 0.000000   0
## 4 0.009766 0.105470   0 0.026367 0.000000 0.002930   0   0 0.000000   0
## 5 0.020508 0.150390   0 0.002930 0.000000 0.023438   0   0 0.000000   0
## 6 0.018555 0.110350   0 0.006836 0.000000 0.008789   0   0 0.000977   0
##        V65
## 1 0.027344
## 2 0.002930
## 3 0.005859
## 4 0.022461
## 5 0.015625
## 6 0.030273

We expect all three datasets to have the same dimensions.

dim(margin_data)
## [1] 1600   65
dim(shape_data)
## [1] 1600   65
dim(texture_data)
## [1] 1599   65

However, the texture dataset has one row fewer than the others. We will fix this issue later.
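
The same comparison can be done in one step by collecting the data frames in a named list. This is a minimal sketch on small made-up data frames standing in for the real ones. The names datasets, dims and odd are chosen for illustration.

```r
# Toy stand-ins for the three datasets: the third one has one row fewer
datasets <- list(
  margin  = data.frame(V1 = c("a", "a", "b", "b"), V2 = c(0.1, 0.2, 0.3, 0.4)),
  shape   = data.frame(V1 = c("a", "a", "b", "b"), V2 = c(0.5, 0.6, 0.7, 0.8)),
  texture = data.frame(V1 = c("a", "b", "b"),      V2 = c(0.9, 1.0, 1.1))
)

# One column per dataset, rows holding the number of rows and columns
dims <- sapply(datasets, dim)
rownames(dims) <- c("rows", "cols")

# Datasets whose row count differs from the largest one
odd <- names(which(dims["rows", ] != max(dims["rows", ])))
```

Here odd contains "texture", mirroring the situation found in the real data.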

We then label the columns of each dataset to have proper headers, and check each dataset for missing values and for the number of distinct species.

n_features <- ncol(margin_data) - 1
colnames(margin_data) <- c("species", paste("margin", as.character(1:n_features), sep=""))
margin_data$species <- factor(margin_data$species)
sum(complete.cases(margin_data)) == nrow(margin_data)
## [1] TRUE
dim(margin_data)
## [1] 1600   65
length(unique(margin_data$species))
## [1] 100
n_features <- ncol(shape_data) - 1
colnames(shape_data) <- c("species", paste("shape", as.character(1:n_features), sep=""))
shape_data$species <- factor(shape_data$species)
sum(complete.cases(shape_data)) == nrow(shape_data)
## [1] TRUE
dim(shape_data)
## [1] 1600   65
length(unique(shape_data$species))
## [1] 100
n_features <- ncol(texture_data) - 1
colnames(texture_data) <- c("species", paste("texture", as.character(1:n_features), sep=""))
texture_data$species <- factor(texture_data$species)
sum(complete.cases(texture_data)) == nrow(texture_data)
## [1] TRUE
dim(texture_data)
## [1] 1599   65
length(unique(texture_data$species))
## [1] 100
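
Since the same labelling steps are repeated for each dataset, they could be wrapped in a small function. The sketch below assumes a hypothetical helper named label_features() and runs on a toy data frame with the same layout (species in the first column, features after it).

```r
# label_features() is a hypothetical helper, not part of the original post.
label_features <- function(data, prefix) {
  n_features <- ncol(data) - 1
  # First column is the species, the remaining ones are numbered features
  colnames(data) <- c("species", paste0(prefix, seq_len(n_features)))
  data$species <- factor(data$species)
  data
}

# Toy data frame with default V1, V2, ... headers, as produced by header=FALSE
toy <- data.frame(V1 = c("Acer Campestre", "Acer Mono"),
                  V2 = c(0.1, 0.2),
                  V3 = c(0.3, 0.4))
toy <- label_features(toy, "margin")
```

With the real data, the calls would be label_features(margin_data, "margin"), and similarly with "shape" and "texture".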

To understand which row is missing from the texture dataset, we split each dataset by species and count the rows per species.

margin_count <- sapply(base::split(margin_data, margin_data$species), nrow)
shape_count <- sapply(base::split(shape_data, shape_data$species), nrow)
texture_count <- sapply(base::split(texture_data, texture_data$species), nrow)
which(margin_count != texture_count)
## Acer Campestre 
##              1
which(shape_count != texture_count)
## Acer Campestre 
##              1

There is a missing entry for Acer Campestre in the texture dataset, an issue that we will fix later.
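
An equivalent way to obtain the per-species counts is table(), which avoids splitting the data frames. This is a minimal sketch on made-up species vectors. It assumes that both vectors contain the same set of species, as is the case for the datasets above.

```r
# Toy species columns: the second one lacks one "Acer Campestre" entry
margin_species  <- factor(rep(c("Acer Campestre", "Acer Mono"), each = 3))
texture_species <- factor(c(rep("Acer Campestre", 2), rep("Acer Mono", 3)))

# Rows per species; the texture counts are aligned to the margin species order
margin_n  <- table(margin_species)
texture_n <- table(texture_species)[names(margin_n)]

# Species whose counts differ between the two datasets
short <- names(margin_n)[as.vector(margin_n) != as.vector(texture_n)]
```

Here short holds "Acer Campestre", the species with the missing entry in the toy data.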

Summaries give an idea of the range of margin, shape and texture values across all species.
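
Because summary() prints six statistics for each of the 64 features, its output is long. If only the ranges are of interest, a more compact table can be built as sketched below. The toy data frame and the name ranges are illustrative.

```r
# Toy data frame with the same layout: species first, numeric features after
toy <- data.frame(species = factor(c("a", "a", "b")),
                  margin1 = c(0, 0.5, 0.25),
                  margin2 = c(0.1, 0.2, 0.05))

# One row per feature, with its minimum and maximum across all samples
ranges <- t(sapply(toy[-1], range))
colnames(ranges) <- c("min", "max")
```

For the real data, the same call would be applied to margin_data[-1], and likewise to the other two datasets. The full summary() output follows.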

summary(margin_data)
##             species        margin1            margin2        
##  Acer Campestre :  16   Min.   :0.000000   Min.   :0.000000  
##  Acer Capillipes:  16   1st Qu.:0.001953   1st Qu.:0.001953  
##  Acer Circinatum:  16   Median :0.009766   Median :0.011719  
##  Acer Mono      :  16   Mean   :0.017423   Mean   :0.028365  
##  Acer Opalus    :  16   3rd Qu.:0.025391   3rd Qu.:0.039551  
##  Acer Palmatum  :  16   Max.   :0.087891   Max.   :0.205080  
##  (Other)        :1504                                        
##     margin3           margin4            margin5            margin6       
##  Min.   :0.00000   Min.   :0.000000   Min.   :0.000000   Min.   :0.00000  
##  1st Qu.:0.01367   1st Qu.:0.005859   1st Qu.:0.001953   1st Qu.:0.00000  
##  Median :0.02344   Median :0.013672   Median :0.007812   Median :0.01367  
##  Mean   :0.03189   Mean   :0.023022   Mean   :0.014315   Mean   :0.03808  
##  3rd Qu.:0.04297   3rd Qu.:0.029297   3rd Qu.:0.019531   3rd Qu.:0.05664  
##  Max.   :0.16797   Max.   :0.169920   Max.   :0.111330   Max.   :0.31055  
##                                                                           
##     margin7            margin8            margin9        
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.005859   1st Qu.:0.000000   1st Qu.:0.001953  
##  Median :0.015625   Median :0.000000   Median :0.005859  
##  Mean   :0.019226   Mean   :0.001084   Mean   :0.007098  
##  3rd Qu.:0.029297   3rd Qu.:0.000000   3rd Qu.:0.007812  
##  Max.   :0.091797   Max.   :0.031250   Max.   :0.083984  
##                                                          
##     margin10           margin11           margin12       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.005859   1st Qu.:0.003906   1st Qu.:0.001953  
##  Median :0.015625   Median :0.013672   Median :0.007812  
##  Mean   :0.018739   Mean   :0.024265   Mean   :0.012054  
##  3rd Qu.:0.027344   3rd Qu.:0.041016   3rd Qu.:0.019531  
##  Max.   :0.097656   Max.   :0.125000   Max.   :0.056641  
##                                                          
##     margin13           margin14           margin15       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.009766   1st Qu.:0.000000   1st Qu.:0.001953  
##  Median :0.027344   Median :0.001953   Median :0.011719  
##  Mean   :0.041259   Mean   :0.007704   Mean   :0.015629  
##  3rd Qu.:0.062500   3rd Qu.:0.007812   3rd Qu.:0.025391  
##  Max.   :0.416020   Max.   :0.082031   Max.   :0.066406  
##                                                          
##     margin16           margin17           margin18       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.005859   1st Qu.:0.005859  
##  Median :0.000000   Median :0.013672   Median :0.015625  
##  Mean   :0.000155   Mean   :0.015164   Mean   :0.020378  
##  3rd Qu.:0.000000   3rd Qu.:0.021484   3rd Qu.:0.029297  
##  Max.   :0.029297   Max.   :0.064453   Max.   :0.359380  
##                                                          
##     margin19           margin20           margin21       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.003906   1st Qu.:0.005859   1st Qu.:0.003906  
##  Median :0.007812   Median :0.011719   Median :0.013672  
##  Mean   :0.012681   Mean   :0.013242   Mean   :0.019042  
##  3rd Qu.:0.017578   3rd Qu.:0.017578   3rd Qu.:0.031250  
##  Max.   :0.125000   Max.   :0.056641   Max.   :0.101560  
##                                                          
##     margin22           margin23           margin24       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.000000  
##  Median :0.001953   Median :0.000000   Median :0.001953  
##  Mean   :0.006163   Mean   :0.001034   Mean   :0.007511  
##  3rd Qu.:0.007812   3rd Qu.:0.000000   3rd Qu.:0.009766  
##  Max.   :0.062500   Max.   :0.046875   Max.   :0.078125  
##                                                          
##     margin25           margin26           margin27       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.009766   1st Qu.:0.000000  
##  Median :0.001953   Median :0.017578   Median :0.000000  
##  Mean   :0.008899   Mean   :0.018656   Mean   :0.005560  
##  3rd Qu.:0.013672   3rd Qu.:0.025391   3rd Qu.:0.005859  
##  Max.   :0.082031   Max.   :0.085938   Max.   :0.082031  
##                                                          
##     margin28           margin29          margin30       
##  Min.   :0.000000   Min.   :0.00000   Min.   :0.000000  
##  1st Qu.:0.003906   1st Qu.:0.01172   1st Qu.:0.005859  
##  Median :0.011719   Median :0.02344   Median :0.011719  
##  Mean   :0.015464   Mean   :0.02844   Mean   :0.016312  
##  3rd Qu.:0.023438   3rd Qu.:0.04102   3rd Qu.:0.021484  
##  Max.   :0.080078   Max.   :0.14844   Max.   :0.119140  
##                                                         
##     margin31           margin32           margin33       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.005859  
##  Median :0.003906   Median :0.001953   Median :0.015625  
##  Mean   :0.010898   Mean   :0.009730   Mean   :0.019579  
##  3rd Qu.:0.013672   3rd Qu.:0.009766   3rd Qu.:0.029297  
##  Max.   :0.113280   Max.   :0.126950   Max.   :0.083984  
##                                                          
##     margin34           margin35           margin36       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.001953   1st Qu.:0.003906  
##  Median :0.000000   Median :0.009766   Median :0.013672  
##  Mean   :0.001135   Mean   :0.013522   Mean   :0.017993  
##  3rd Qu.:0.001953   3rd Qu.:0.021484   3rd Qu.:0.027344  
##  Max.   :0.029297   Max.   :0.082031   Max.   :0.089844  
##                                                          
##     margin37           margin38           margin39       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.007812   1st Qu.:0.009766   1st Qu.:0.003906  
##  Median :0.013672   Median :0.028320   Median :0.011719  
##  Mean   :0.016080   Mean   :0.030906   Mean   :0.014969  
##  3rd Qu.:0.021484   3rd Qu.:0.046875   3rd Qu.:0.023438  
##  Max.   :0.068359   Max.   :0.140630   Max.   :0.066406  
##                                                          
##     margin40           margin41           margin42       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.005859  
##  Median :0.003906   Median :0.001953   Median :0.013672  
##  Mean   :0.008188   Mean   :0.010566   Mean   :0.017839  
##  3rd Qu.:0.011719   3rd Qu.:0.009766   3rd Qu.:0.025391  
##  Max.   :0.072266   Max.   :0.208980   Max.   :0.091797  
##                                                          
##     margin43           margin44           margin45       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.003906   1st Qu.:0.003906   1st Qu.:0.005859  
##  Median :0.009766   Median :0.009766   Median :0.017578  
##  Mean   :0.018823   Mean   :0.012565   Mean   :0.024922  
##  3rd Qu.:0.027344   3rd Qu.:0.017578   3rd Qu.:0.035156  
##  Max.   :0.117190   Max.   :0.064453   Max.   :0.177730  
##                                                          
##     margin46           margin47           margin48       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.005859   1st Qu.:0.001953  
##  Median :0.003906   Median :0.021484   Median :0.017578  
##  Mean   :0.010102   Mean   :0.025510   Mean   :0.027270  
##  3rd Qu.:0.015625   3rd Qu.:0.041016   3rd Qu.:0.044922  
##  Max.   :0.089844   Max.   :0.105470   Max.   :0.183590  
##                                                          
##     margin49           margin50           margin51       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.005859  
##  Median :0.001953   Median :0.005859   Median :0.019531  
##  Mean   :0.008529   Mean   :0.013793   Mean   :0.025535  
##  3rd Qu.:0.009766   3rd Qu.:0.021484   3rd Qu.:0.039062  
##  Max.   :0.107420   Max.   :0.091797   Max.   :0.130860  
##                                                          
##     margin52           margin53           margin54       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.009766   1st Qu.:0.000000  
##  Median :0.000000   Median :0.021484   Median :0.002929  
##  Mean   :0.002682   Mean   :0.024270   Mean   :0.010022  
##  3rd Qu.:0.001953   3rd Qu.:0.035156   3rd Qu.:0.013672  
##  Max.   :0.083984   Max.   :0.097656   Max.   :0.128910  
##                                                          
##     margin55           margin56           margin57       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.001953   1st Qu.:0.000000   1st Qu.:0.005859  
##  Median :0.011719   Median :0.001953   Median :0.009766  
##  Mean   :0.018063   Mean   :0.005997   Mean   :0.012411  
##  3rd Qu.:0.027344   3rd Qu.:0.007812   3rd Qu.:0.017578  
##  Max.   :0.140630   Max.   :0.054688   Max.   :0.062500  
##                                                          
##     margin58           margin59          margin60       
##  Min.   :0.000000   Min.   :0.00000   Min.   :0.000000  
##  1st Qu.:0.005859   1st Qu.:0.01172   1st Qu.:0.003906  
##  Median :0.015625   Median :0.02344   Median :0.007812  
##  Mean   :0.020353   Mean   :0.03110   Mean   :0.011927  
##  3rd Qu.:0.029297   3rd Qu.:0.03906   3rd Qu.:0.015625  
##  Max.   :0.113280   Max.   :0.25195   Max.   :0.089844  
##                                                         
##     margin61           margin62           margin63       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.009766  
##  Median :0.000000   Median :0.000000   Median :0.021484  
##  Mean   :0.001387   Mean   :0.004972   Mean   :0.025350  
##  3rd Qu.:0.001953   3rd Qu.:0.003906   3rd Qu.:0.037109  
##  Max.   :0.042969   Max.   :0.076172   Max.   :0.125000  
##                                                          
##     margin64       
##  Min.   :0.000000  
##  1st Qu.:0.000000  
##  Median :0.001953  
##  Mean   :0.004163  
##  3rd Qu.:0.003906  
##  Max.   :0.089844  
## 
summary(shape_data)
##             species         shape1              shape2         
##  Acer Campestre :  16   Min.   :0.0001680   Min.   :0.0001816  
##  Acer Capillipes:  16   1st Qu.:0.0005203   1st Qu.:0.0005067  
##  Acer Circinatum:  16   Median :0.0007098   Median :0.0006896  
##  Acer Mono      :  16   Mean   :0.0007366   Mean   :0.0007152  
##  Acer Opalus    :  16   3rd Qu.:0.0009275   3rd Qu.:0.0008980  
##  Acer Palmatum  :  16   Max.   :0.0023897   Max.   :0.0022466  
##  (Other)        :1504                                          
##      shape3              shape4              shape5         
##  Min.   :0.0001482   Min.   :0.0001037   Min.   :0.0001198  
##  1st Qu.:0.0004968   1st Qu.:0.0004832   1st Qu.:0.0004782  
##  Median :0.0006709   Median :0.0006516   Median :0.0006365  
##  Mean   :0.0006902   Mean   :0.0006669   Mean   :0.0006457  
##  3rd Qu.:0.0008640   3rd Qu.:0.0008330   3rd Qu.:0.0007902  
##  Max.   :0.0021123   Max.   :0.0019977   Max.   :0.0021510  
##                                                             
##      shape6              shape7              shape8         
##  Min.   :0.0001183   Min.   :8.493e-05   Min.   :6.811e-05  
##  1st Qu.:0.0004651   1st Qu.:4.554e-04   1st Qu.:4.409e-04  
##  Median :0.0006180   Median :5.992e-04   Median :5.833e-04  
##  Mean   :0.0006275   Mean   :6.124e-04   Mean   :5.998e-04  
##  3rd Qu.:0.0007536   3rd Qu.:7.246e-04   3rd Qu.:7.050e-04  
##  Max.   :0.0022666   Max.   :2.305e-03   Max.   :2.439e-03  
##                                                             
##      shape9             shape10             shape11         
##  Min.   :8.752e-05   Min.   :0.0001088   Min.   :7.599e-05  
##  1st Qu.:4.137e-04   1st Qu.:0.0003952   1st Qu.:3.766e-04  
##  Median :5.603e-04   Median :0.0005323   Median :5.006e-04  
##  Mean   :5.871e-04   Mean   :0.0005751   Mean   :5.656e-04  
##  3rd Qu.:6.893e-04   3rd Qu.:0.0006816   3rd Qu.:6.691e-04  
##  Max.   :2.521e-03   Max.   :0.0026874   Max.   :2.774e-03  
##                                                             
##     shape12             shape13             shape14         
##  Min.   :0.0001063   Min.   :8.088e-05   Min.   :9.678e-05  
##  1st Qu.:0.0003582   1st Qu.:3.360e-04   1st Qu.:3.246e-04  
##  Median :0.0004754   Median :4.552e-04   Median :4.423e-04  
##  Mean   :0.0005580   Mean   :5.522e-04   Mean   :5.475e-04  
##  3rd Qu.:0.0006587   3rd Qu.:6.546e-04   3rd Qu.:6.391e-04  
##  Max.   :0.0028893   Max.   :3.007e-03   Max.   :2.815e-03  
##                                                             
##     shape15             shape16             shape17         
##  Min.   :7.069e-05   Min.   :6.719e-05   Min.   :4.453e-05  
##  1st Qu.:3.157e-04   1st Qu.:3.104e-04   1st Qu.:3.076e-04  
##  Median :4.305e-04   Median :4.220e-04   Median :4.255e-04  
##  Mean   :5.429e-04   Mean   :5.409e-04   Mean   :5.410e-04  
##  3rd Qu.:6.363e-04   3rd Qu.:6.350e-04   3rd Qu.:6.349e-04  
##  Max.   :2.719e-03   Max.   :2.532e-03   Max.   :2.410e-03  
##                                                             
##     shape18             shape19             shape20         
##  Min.   :0.0000376   Min.   :2.999e-05   Min.   :3.977e-05  
##  1st Qu.:0.0003120   1st Qu.:3.196e-04   1st Qu.:3.321e-04  
##  Median :0.0004272   Median :4.340e-04   Median :4.476e-04  
##  Mean   :0.0005426   Mean   :5.439e-04   Mean   :5.481e-04  
##  3rd Qu.:0.0006301   3rd Qu.:6.244e-04   3rd Qu.:6.208e-04  
##  Max.   :0.0025374   Max.   :2.475e-03   Max.   :2.383e-03  
##                                                             
##     shape21             shape22             shape23         
##  Min.   :5.501e-05   Min.   :4.503e-05   Min.   :4.638e-05  
##  1st Qu.:3.501e-04   1st Qu.:3.734e-04   1st Qu.:4.022e-04  
##  Median :4.712e-04   Median :4.933e-04   Median :5.189e-04  
##  Mean   :5.556e-04   Mean   :5.650e-04   Mean   :5.767e-04  
##  3rd Qu.:6.323e-04   3rd Qu.:6.486e-04   3rd Qu.:6.662e-04  
##  Max.   :2.274e-03   Max.   :2.420e-03   Max.   :2.553e-03  
##                                                             
##     shape24             shape25             shape26         
##  Min.   :5.573e-05   Min.   :5.972e-05   Min.   :3.117e-05  
##  1st Qu.:4.184e-04   1st Qu.:4.355e-04   1st Qu.:4.569e-04  
##  Median :5.475e-04   Median :5.713e-04   Median :5.957e-04  
##  Mean   :5.882e-04   Mean   :6.009e-04   Mean   :6.149e-04  
##  3rd Qu.:6.786e-04   3rd Qu.:7.093e-04   3rd Qu.:7.338e-04  
##  Max.   :2.434e-03   Max.   :2.274e-03   Max.   :2.130e-03  
##                                                             
##     shape27             shape28             shape29         
##  Min.   :8.798e-05   Min.   :8.671e-05   Min.   :5.583e-05  
##  1st Qu.:4.687e-04   1st Qu.:4.745e-04   1st Qu.:4.799e-04  
##  Median :6.162e-04   Median :6.420e-04   Median :6.643e-04  
##  Mean   :6.290e-04   Mean   :6.449e-04   Mean   :6.632e-04  
##  3rd Qu.:7.614e-04   3rd Qu.:7.965e-04   3rd Qu.:8.299e-04  
##  Max.   :1.988e-03   Max.   :1.833e-03   Max.   :1.824e-03  
##                                                             
##     shape30             shape31             shape32         
##  Min.   :5.366e-05   Min.   :6.743e-05   Min.   :5.315e-05  
##  1st Qu.:4.930e-04   1st Qu.:5.096e-04   1st Qu.:5.214e-04  
##  Median :6.838e-04   Median :7.018e-04   Median :7.199e-04  
##  Mean   :6.851e-04   Mean   :7.083e-04   Mean   :7.300e-04  
##  3rd Qu.:8.623e-04   3rd Qu.:8.952e-04   3rd Qu.:9.243e-04  
##  Max.   :1.855e-03   Max.   :2.002e-03   Max.   :2.124e-03  
##                                                             
##     shape33             shape34             shape35         
##  Min.   :9.131e-05   Min.   :5.456e-05   Min.   :3.387e-05  
##  1st Qu.:5.213e-04   1st Qu.:5.121e-04   1st Qu.:5.025e-04  
##  Median :7.223e-04   Median :7.093e-04   Median :6.760e-04  
##  Mean   :7.345e-04   Mean   :7.136e-04   Mean   :6.858e-04  
##  3rd Qu.:9.313e-04   3rd Qu.:8.991e-04   3rd Qu.:8.628e-04  
##  Max.   :2.111e-03   Max.   :1.988e-03   Max.   :1.868e-03  
##                                                             
##     shape36             shape37             shape38         
##  Min.   :4.618e-05   Min.   :6.663e-05   Min.   :6.401e-05  
##  1st Qu.:4.850e-04   1st Qu.:4.741e-04   1st Qu.:4.590e-04  
##  Median :6.544e-04   Median :6.378e-04   Median :6.195e-04  
##  Mean   :6.608e-04   Mean   :6.382e-04   Mean   :6.174e-04  
##  3rd Qu.:8.276e-04   3rd Qu.:7.878e-04   3rd Qu.:7.514e-04  
##  Max.   :1.886e-03   Max.   :2.011e-03   Max.   :2.107e-03  
##                                                             
##     shape39             shape40             shape41         
##  Min.   :0.0000623   Min.   :6.818e-05   Min.   :7.033e-05  
##  1st Qu.:0.0004462   1st Qu.:4.290e-04   1st Qu.:4.162e-04  
##  Median :0.0005905   Median :5.677e-04   Median :5.504e-04  
##  Mean   :0.0005989   Mean   :5.817e-04   Mean   :5.679e-04  
##  3rd Qu.:0.0007252   3rd Qu.:7.019e-04   3rd Qu.:6.873e-04  
##  Max.   :0.0021057   Max.   :2.195e-03   Max.   :2.230e-03  
##                                                             
##     shape42             shape43             shape44         
##  Min.   :5.189e-05   Min.   :3.776e-05   Min.   :3.532e-05  
##  1st Qu.:3.964e-04   1st Qu.:3.744e-04   1st Qu.:3.533e-04  
##  Median :5.281e-04   Median :5.084e-04   Median :4.894e-04  
##  Mean   :5.579e-04   Mean   :5.517e-04   Mean   :5.473e-04  
##  3rd Qu.:6.641e-04   3rd Qu.:6.468e-04   3rd Qu.:6.391e-04  
##  Max.   :2.285e-03   Max.   :2.324e-03   Max.   :2.348e-03  
##                                                             
##     shape45             shape46             shape47         
##  Min.   :0.0000429   Min.   :5.534e-05   Min.   :4.349e-05  
##  1st Qu.:0.0003457   1st Qu.:3.323e-04   1st Qu.:3.210e-04  
##  Median :0.0004633   Median :4.465e-04   Median :4.421e-04  
##  Mean   :0.0005448   Mean   :5.430e-04   Mean   :5.431e-04  
##  3rd Qu.:0.0006425   3rd Qu.:6.459e-04   3rd Qu.:6.475e-04  
##  Max.   :0.0024146   Max.   :2.378e-03   Max.   :2.342e-03  
##                                                             
##     shape48             shape49             shape50         
##  Min.   :3.953e-05   Min.   :2.167e-05   Min.   :6.256e-05  
##  1st Qu.:3.128e-04   1st Qu.:3.114e-04   1st Qu.:3.153e-04  
##  Median :4.356e-04   Median :4.408e-04   Median :4.409e-04  
##  Mean   :5.433e-04   Mean   :5.467e-04   Mean   :5.482e-04  
##  3rd Qu.:6.631e-04   3rd Qu.:6.684e-04   3rd Qu.:6.736e-04  
##  Max.   :2.458e-03   Max.   :2.315e-03   Max.   :2.259e-03  
##                                                             
##     shape51             shape52             shape53         
##  Min.   :9.222e-05   Min.   :0.0001067   Min.   :8.834e-05  
##  1st Qu.:3.211e-04   1st Qu.:0.0003351   1st Qu.:3.519e-04  
##  Median :4.500e-04   Median :0.0004544   Median :4.707e-04  
##  Mean   :5.479e-04   Mean   :0.0005482   Mean   :5.515e-04  
##  3rd Qu.:6.706e-04   3rd Qu.:0.0006591   3rd Qu.:6.532e-04  
##  Max.   :2.374e-03   Max.   :0.0022932   Max.   :2.166e-03  
##                                                             
##     shape54             shape55             shape56         
##  Min.   :9.699e-05   Min.   :0.0001084   Min.   :6.926e-05  
##  1st Qu.:3.668e-04   1st Qu.:0.0003825   1st Qu.:4.032e-04  
##  Median :4.984e-04   Median :0.0005194   Median :5.421e-04  
##  Mean   :5.557e-04   Mean   :0.0005620   Mean   :5.691e-04  
##  3rd Qu.:6.546e-04   3rd Qu.:0.0006630   3rd Qu.:6.734e-04  
##  Max.   :2.130e-03   Max.   :0.0022557   Max.   :2.132e-03  
##                                                             
##     shape57             shape58             shape59         
##  Min.   :5.509e-05   Min.   :3.757e-05   Min.   :0.0000396  
##  1st Qu.:4.123e-04   1st Qu.:4.268e-04   1st Qu.:0.0004392  
##  Median :5.608e-04   Median :5.868e-04   Median :0.0006083  
##  Mean   :5.766e-04   Mean   :5.884e-04   Mean   :0.0006043  
##  3rd Qu.:6.886e-04   3rd Qu.:7.095e-04   3rd Qu.:0.0007369  
##  Max.   :1.990e-03   Max.   :1.861e-03   Max.   :0.0017404  
##                                                             
##     shape60             shape61             shape62         
##  Min.   :4.163e-05   Min.   :8.676e-05   Min.   :0.0001340  
##  1st Qu.:4.538e-04   1st Qu.:4.623e-04   1st Qu.:0.0004755  
##  Median :6.296e-04   Median :6.472e-04   Median :0.0006668  
##  Mean   :6.236e-04   Mean   :6.454e-04   Mean   :0.0006714  
##  3rd Qu.:7.737e-04   3rd Qu.:8.080e-04   3rd Qu.:0.0008439  
##  Max.   :1.760e-03   Max.   :1.922e-03   Max.   :0.0020901  
##                                                             
##     shape63             shape64         
##  Min.   :0.0001685   Min.   :0.0001659  
##  1st Qu.:0.0004985   1st Qu.:0.0005112  
##  Median :0.0006884   Median :0.0007058  
##  Mean   :0.0007009   Mean   :0.0007289  
##  3rd Qu.:0.0008825   3rd Qu.:0.0009208  
##  Max.   :0.0022627   Max.   :0.0024313  
## 
summary(texture_data)
##             species        texture1           texture2       
##  Acer Capillipes:  16   Min.   :0.000000   Min.   :0.000000  
##  Acer Circinatum:  16   1st Qu.:0.000000   1st Qu.:0.000977  
##  Acer Mono      :  16   Median :0.006836   Median :0.004883  
##  Acer Opalus    :  16   Mean   :0.021376   Mean   :0.011693  
##  Acer Palmatum  :  16   3rd Qu.:0.021972   3rd Qu.:0.014648  
##  Acer Pictum    :  16   Max.   :0.413090   Max.   :0.147460  
##  (Other)        :1503                                        
##     texture3           texture4           texture5       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.000000  
##  Median :0.005859   Median :0.005859   Median :0.006836  
##  Mean   :0.010230   Mean   :0.015315   Mean   :0.026646  
##  3rd Qu.:0.014648   3rd Qu.:0.021484   3rd Qu.:0.044922  
##  Max.   :0.118160   Max.   :0.161130   Max.   :0.201170  
##                                                          
##     texture6           texture7           texture8       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.001953   1st Qu.:0.000000  
##  Median :0.000977   Median :0.009766   Median :0.008789  
##  Mean   :0.009449   Mean   :0.016449   Mean   :0.019495  
##  3rd Qu.:0.008789   3rd Qu.:0.022461   3rd Qu.:0.030273  
##  Max.   :0.146480   Max.   :0.183590   Max.   :0.176760  
##                                                          
##     texture9          texture10          texture11       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000977   1st Qu.:0.000000   1st Qu.:0.000000  
##  Median :0.006836   Median :0.003906   Median :0.003906  
##  Mean   :0.015011   Mean   :0.019787   Mean   :0.019310  
##  3rd Qu.:0.020508   3rd Qu.:0.022461   3rd Qu.:0.025391  
##  Max.   :0.196290   Max.   :0.315430   Max.   :0.285160  
##                                                          
##    texture12          texture13          texture14       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.000000  
##  Median :0.000000   Median :0.004883   Median :0.006836  
##  Mean   :0.027292   Mean   :0.009721   Mean   :0.012895  
##  3rd Qu.:0.007812   3rd Qu.:0.014648   3rd Qu.:0.018555  
##  Max.   :0.507810   Max.   :0.111330   Max.   :0.184570  
##                                                          
##    texture15         texture16          texture17       
##  Min.   :0.00000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.000000  
##  Median :0.00000   Median :0.000000   Median :0.003906  
##  Mean   :0.01272   Mean   :0.004877   Mean   :0.014521  
##  3rd Qu.:0.00000   3rd Qu.:0.002930   3rd Qu.:0.017578  
##  Max.   :0.85352   Max.   :0.119140   Max.   :0.305660  
##                                                         
##    texture18          texture19         texture20       
##  Min.   :0.000000   Min.   :0.00000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.00293   1st Qu.:0.002930  
##  Median :0.000000   Median :0.01367   Median :0.008789  
##  Mean   :0.005795   Mean   :0.02442   Mean   :0.014532  
##  3rd Qu.:0.004883   3rd Qu.:0.03564   3rd Qu.:0.020508  
##  Max.   :0.208980   Max.   :0.30078   Max.   :0.134770  
##                                                         
##    texture21          texture22         texture23       
##  Min.   :0.000000   Min.   :0.00000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.00000   1st Qu.:0.001953  
##  Median :0.000000   Median :0.00293   Median :0.007812  
##  Mean   :0.002899   Mean   :0.01319   Mean   :0.017030  
##  3rd Qu.:0.000000   3rd Qu.:0.01758   3rd Qu.:0.022461  
##  Max.   :0.097656   Max.   :0.17773   Max.   :0.195310  
##                                                         
##    texture24          texture25          texture26       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.000000  
##  Median :0.001953   Median :0.003906   Median :0.004883  
##  Mean   :0.010304   Mean   :0.010450   Mean   :0.023417  
##  3rd Qu.:0.011719   3rd Qu.:0.014648   3rd Qu.:0.027344  
##  Max.   :0.134770   Max.   :0.153320   Max.   :0.302730  
##                                                          
##    texture27          texture28          texture29       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.003906  
##  Median :0.006836   Median :0.001953   Median :0.012695  
##  Mean   :0.021175   Mean   :0.010016   Mean   :0.021362  
##  3rd Qu.:0.030273   3rd Qu.:0.010742   3rd Qu.:0.030273  
##  Max.   :0.227540   Max.   :0.170900   Max.   :0.214840  
##                                                          
##    texture30          texture31          texture32       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000977   1st Qu.:0.004883   1st Qu.:0.000000  
##  Median :0.007812   Median :0.015625   Median :0.000000  
##  Mean   :0.011489   Mean   :0.024020   Mean   :0.006121  
##  3rd Qu.:0.016602   3rd Qu.:0.033203   3rd Qu.:0.002930  
##  Max.   :0.085938   Max.   :0.285160   Max.   :0.128910  
##                                                          
##    texture33         texture34         texture35          texture36       
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.000977   1st Qu.:0.000000  
##  Median :0.00000   Median :0.01074   Median :0.006836   Median :0.000000  
##  Mean   :0.02210   Mean   :0.02676   Mean   :0.010475   Mean   :0.003334  
##  3rd Qu.:0.01172   3rd Qu.:0.03418   3rd Qu.:0.014648   3rd Qu.:0.000000  
##  Max.   :0.46094   Max.   :0.35352   Max.   :0.085938   Max.   :0.115230  
##                                                                           
##    texture37         texture38          texture39       
##  Min.   :0.00000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.00000   1st Qu.:0.000977   1st Qu.:0.000977  
##  Median :0.00000   Median :0.016602   Median :0.008789  
##  Mean   :0.02091   Mean   :0.023625   Mean   :0.017261  
##  3rd Qu.:0.01074   3rd Qu.:0.036133   3rd Qu.:0.026367  
##  Max.   :0.55664   Max.   :0.217770   Max.   :0.147460  
##                                                         
##    texture40          texture41         texture42       
##  Min.   :0.000000   Min.   :0.00000   Min.   :0.000000  
##  1st Qu.:0.001953   1st Qu.:0.00000   1st Qu.:0.000000  
##  Median :0.009766   Median :0.00000   Median :0.000977  
##  Mean   :0.019240   Mean   :0.01573   Mean   :0.006061  
##  3rd Qu.:0.026367   3rd Qu.:0.01074   3rd Qu.:0.006836  
##  Max.   :0.166020   Max.   :0.41113   Max.   :0.075195  
##                                                         
##    texture43          texture44          texture45       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000977   1st Qu.:0.000000   1st Qu.:0.001465  
##  Median :0.007812   Median :0.005859   Median :0.007812  
##  Mean   :0.014154   Mean   :0.026033   Mean   :0.016452  
##  3rd Qu.:0.020508   3rd Qu.:0.031250   3rd Qu.:0.022461  
##  Max.   :0.176760   Max.   :0.412110   Max.   :0.137700  
##                                                          
##    texture46         texture47          texture48       
##  Min.   :0.00000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.00000   1st Qu.:0.004883   1st Qu.:0.000000  
##  Median :0.01172   Median :0.015625   Median :0.000977  
##  Mean   :0.02188   Mean   :0.018734   Mean   :0.016594  
##  3rd Qu.:0.03516   3rd Qu.:0.029297   3rd Qu.:0.021972  
##  Max.   :0.17578   Max.   :0.098633   Max.   :0.249020  
##                                                         
##    texture49          texture50          texture51       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000977   1st Qu.:0.001953   1st Qu.:0.000000  
##  Median :0.007812   Median :0.009766   Median :0.000000  
##  Mean   :0.011642   Mean   :0.019195   Mean   :0.013255  
##  3rd Qu.:0.017578   3rd Qu.:0.024414   3rd Qu.:0.008301  
##  Max.   :0.114260   Max.   :0.232420   Max.   :0.390630  
##                                                          
##    texture52          texture53          texture54      
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.00000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.00293  
##  Median :0.000000   Median :0.004883   Median :0.01269  
##  Mean   :0.007108   Mean   :0.014275   Mean   :0.02243  
##  3rd Qu.:0.009766   3rd Qu.:0.020508   3rd Qu.:0.03320  
##  Max.   :0.117190   Max.   :0.165040   Max.   :0.29297  
##                                                         
##    texture55          texture56          texture57       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.000977  
##  Median :0.004883   Median :0.000000   Median :0.005859  
##  Mean   :0.036817   Mean   :0.005311   Mean   :0.015576  
##  3rd Qu.:0.045410   3rd Qu.:0.000000   3rd Qu.:0.020508  
##  Max.   :0.429690   Max.   :0.441410   Max.   :0.172850  
##                                                          
##    texture58          texture59          texture60      
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.00000  
##  1st Qu.:0.000000   1st Qu.:0.004883   1st Qu.:0.00000  
##  Median :0.000977   Median :0.012695   Median :0.00000  
##  Mean   :0.011543   Mean   :0.015980   Mean   :0.01285  
##  3rd Qu.:0.009766   3rd Qu.:0.021484   3rd Qu.:0.00000  
##  Max.   :0.200200   Max.   :0.106450   Max.   :0.60645  
##                                                         
##    texture61          texture62          texture63       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.000000  
##  Median :0.000000   Median :0.003906   Median :0.002930  
##  Mean   :0.002637   Mean   :0.019987   Mean   :0.009054  
##  3rd Qu.:0.000000   3rd Qu.:0.022461   3rd Qu.:0.013184  
##  Max.   :0.151370   Max.   :0.375980   Max.   :0.086914  
##                                                          
##    texture64       
##  Min.   :0.000000  
##  1st Qu.:0.000977  
##  Median :0.012695  
##  Mean   :0.019982  
##  3rd Qu.:0.031250  
##  Max.   :0.149410  
## 
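
The summaries above are long to scan by eye. As a rough sketch of a more compact view, the snippet below builds one row per feature with the share of exact zeros, the median and the maximum. The data frame toy and its values are made up for illustration; on the real data the same lines would be applied to texture_data.

```r
# Toy stand-in for texture_data (hypothetical values, not the real dataset)
toy <- data.frame(
  species  = c("A", "A", "B", "B"),
  texture1 = c(0, 0.01, 0, 0.02),
  texture2 = c(0, 0, 0, 0.05)
)

# Keep only the numeric feature columns
features <- toy[sapply(toy, is.numeric)]

# One row per feature: share of exact zeros, median and maximum
compact <- data.frame(
  zero_share = sapply(features, function(x) mean(x == 0)),
  median     = sapply(features, median),
  max        = sapply(features, max)
)
compact
```

A view like this makes it easier to spot the texture features whose median, or even third quartile, is zero.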

Boxplots can be used to assess qualitatively whether the range of values of a given feature separates across species. Here we do so for only a few features, because of the limit on publishable post size. We define a helper function for the purpose.

species_boxplot <- function(dataset, variable) {
  # 'variable' is a column name passed as a string; .data[[variable]] (ggplot2 >= 3.0.0) looks it up in 'dataset'
  ggplot(data = dataset, aes(x = species, y = .data[[variable]], colour = species)) +
    geom_boxplot() +
    ylab(variable) +
    ggtitle(paste(variable, "among species")) +
    theme(legend.position = "none",
          axis.text.x = element_text(angle = 90, hjust = 1))
}
species_boxplot(margin_data, "margin1")
species_boxplot(shape_data, "shape20")
species_boxplot(texture_data, "texture30")

Visual inspection of the boxplots above suggests that these features can help in prediction, as their range of values tends to differ across species.
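
The visual impression can be complemented with a simple test. The sketch below is not part of the original analysis: it runs a Kruskal-Wallis rank-sum test (kruskal.test from base R's stats package) on a made-up data frame with three hypothetical species, to check whether a feature's distribution differs across species.

```r
# Toy data: 3 hypothetical species, 16 samples each (not the real dataset)
set.seed(1)
toy <- data.frame(
  species  = factor(rep(c("A", "B", "C"), each = 16)),
  feature1 = c(rnorm(16, mean = 0.01, sd = 0.002),
               rnorm(16, mean = 0.02, sd = 0.002),
               rnorm(16, mean = 0.03, sd = 0.002))
)

# Kruskal-Wallis rank-sum test: does the feature distribution differ across species?
kw <- kruskal.test(feature1 ~ species, data = toy)
kw$p.value
```

On the real data the call would look like kruskal.test(margin1 ~ species, data = margin_data). A small p-value only indicates that at least one species differs from the others, so it supports the boxplots rather than replacing them.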

We then delete the 16th row from the margin and shape datasets, because the texture dataset has a missing entry at that row number. This keeps the rows of the three datasets aligned.

margin_data <- margin_data[-16,]
shape_data <- shape_data[-16,]
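
Since the join that follows relies on row position, it may be worth checking that the species labels still line up after the deletion. Below is a minimal sketch on made-up data, where margin_toy and texture_toy are hypothetical stand-ins and row 3 plays the role of row 16.

```r
# Toy stand-ins: two species, 3 samples each; texture_toy lacks sample 3 of species "A"
margin_toy  <- data.frame(species = rep(c("A", "B"), each = 3), margin1 = 1:6)
texture_toy <- data.frame(species = c("A", "A", "B", "B", "B"), texture1 = 1:5)

# Drop the row that has no counterpart (row 3 here, row 16 in the post)
margin_toy <- margin_toy[-3, ]

# After the deletion, row counts and species labels should agree position by position
same_rows <- nrow(margin_toy) == nrow(texture_toy)
aligned   <- all(as.character(margin_toy$species) == as.character(texture_toy$species))
c(same_rows = same_rows, aligned = aligned)
```

The same comparison applied to margin_data, shape_data and texture_data would confirm that row 16 was the right one to drop.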

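Let us add a row identifier to each of the three datasets. That identifier makes it easy to join the tables. Before joining, we drop the species column from the shape and texture datasets so that it is not duplicated in the result.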

# Add a row identifier to each dataset
margin_data <- mutate(margin_data, id = seq_len(nrow(margin_data)))
shape_data <- mutate(shape_data, id = seq_len(nrow(shape_data)))
texture_data <- mutate(texture_data, id = seq_len(nrow(texture_data)))

# Copies of shape and texture without the species column
shape_data_2 <- shape_data
shape_data_2$species <- NULL

texture_data_2 <- texture_data
texture_data_2$species <- NULL

# Join margin, shape and texture features row by row on id
leaves_data <- margin_data
leaves_data <- left_join(leaves_data, shape_data_2, by = "id")
leaves_data <- left_join(leaves_data, texture_data_2, by = "id")

dim(leaves_data)
## [1] 1599  194
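
The 194 columns are the species column, three sets of 64 features and the id column. The sketch below reproduces the same join pattern on made-up tables with two features per type instead of 64. The names margin_toy, shape_toy and texture_toy and all their values are hypothetical.

```r
suppressPackageStartupMessages(library(dplyr))

# Toy stand-ins with 2 features per type instead of 64 (hypothetical values)
margin_toy  <- data.frame(species = c("A", "B"),
                          margin1 = c(0.1, 0.2), margin2 = c(0.3, 0.4), id = 1:2)
shape_toy   <- data.frame(shape1 = c(1, 2), shape2 = c(3, 4), id = 1:2)
texture_toy <- data.frame(texture1 = c(5, 6), texture2 = c(7, 8), id = 1:2)

# Same pattern as in the post: join on id, species kept only from the first table
joined <- margin_toy %>%
  left_join(shape_toy, by = "id") %>%
  left_join(texture_toy, by = "id")
dim(joined)

# Column count expected for the real data: species + 3 * 64 features + id
expected_cols <- 1 + 3 * 64 + 1
expected_cols
```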

Finally, we save the current environment for the further analysis that will be outlined in my next post.

save.image(file='PlantLeafEnvironment.RData')
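
save.image() stores every object in the workspace. If only the joined table is needed later, a lighter alternative is to save that single object with saveRDS() and restore it with readRDS(). Below is a minimal sketch on a made-up data frame, written to a temporary file. The name toy and its values are hypothetical.

```r
# Toy stand-in for leaves_data (hypothetical values)
toy <- data.frame(id = 1:3, margin1 = c(0.1, 0.2, 0.3))

# Write a single object to a temporary file (a real path would be used in practice)
rds_file <- tempfile(fileext = ".rds")
saveRDS(toy, rds_file)

# Read it back, possibly in a fresh session and under any name
restored <- readRDS(rds_file)
all.equal(toy, restored)
```

Unlike load(), readRDS() returns the object instead of restoring it under its original name, so it cannot silently overwrite variables in the session.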

References

[1] Charles Mallah, James Cope and James Orwell, "Plant leaf classification using probabilistic integration of shape, texture and margin features" [https://www.researchgate.net/publication/266632357_Plant_Leaf_Classification_using_Probabilistic_Integration_of_Shape_Texture_and_Margin_Features]

[2] 100 Plant Leaf Dataset [https://archive.ics.uci.edu/ml/datasets/One-hundred+plant+species+leaves+data+set]
