This notebook describes an example of using the caret¹ package to conduct hyperparameter tuning for the k-Nearest Neighbour classifier. The example dataset is the banknote data frame found in the mclust² package. It contains six measurements made on 100 genuine and 100 counterfeit old-Swiss 1000-franc bank notes.
There are six predictor variables (Length through Diagonal), with Status being the categorical response, or class variable, which has two levels: genuine and counterfeit. Observe that the dataset is balanced, with 100 observations for each level of Status.
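The class balance can be checked directly from the data. The following is a minimal sketch, assuming the mclust package is installed; the variable name class_counts is introduced here for illustration only.

```r
# Load only the banknote data frame, without attaching all of mclust
# (assumes mclust is installed)
data("banknote", package = "mclust")

# Count the observations for each level of the response variable
class_counts <- table(banknote$Status)
print(class_counts)

# List the columns: the response plus the six measurements
print(names(banknote))
```

A balanced response means plain accuracy is a reasonable metric to use when comparing candidate values of k later on.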
For most of the measurements other than Length, genuine and counterfeit notes have quite distinct distributions.
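library(dplyr)    # mutate() and the %>% pipe
library(tidyr)    # pivot_longer()
library(ggplot2)  # plotting

# Load the banknote data frame from the mclust package
data("banknote", package = "mclust")

banknote %>%
  mutate(ID = 1:n()) %>%
  # Reshape to long format: one row per note and measurement
  pivot_longer(Length:Diagonal, names_to = "Dimension", values_to = "Size") %>%
  mutate(Dimension = factor(Dimension), ID = factor(ID)) %>%
  ggplot() +
  aes(y = Size, fill = Status) +
  # One panel per dimension, each with its own y-axis scale
  facet_wrap(~ Dimension, scales = "free") +
  geom_boxplot() +
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank()) +
  labs(y = "Size (mm)", title = "Comparison of bank note dimensions")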