MthStat 568/768 - Multivariate Statistical Analysis - Spring 2025
Homework 6
Due Wednesday, April 23
1. Consider the pendigits data set, which consists of samples of the handwritten digits 0, 1, ..., 9.
The feature variables in this case are the (x, y) coordinates of the pen tip,
discretized at eight time points (see section 7.2.1 of the book for more details).
(a) Split the data set into training and test sets (roughly a 70/30 split).
Compute Gaussian-kernel support vector machine classifiers for several values
of the tuning parameter. For each value, find the misclassification rate (MCR)
on the test set. What's the lowest MCR attained?
(b) For the optimal value of the tuning parameter found in (a), construct the
misclassification table for the test data. Which digit seems to be the hardest
to classify correctly?
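
As a rough illustration of the bookkeeping in (a) and (b), the sketch below computes a misclassification rate, a misclassification table (rows are true classes, columns are predicted classes), and per-class error rates from a vector of true labels and a vector of predicted labels. It is only one possible way to organize the calculation. The function names and the toy labels are hypothetical and are not taken from the book or the pendigits data; in the actual problem the predictions would come from the fitted SVM at each value of the tuning parameter.

```python
from collections import Counter


def misclassification_rate(y_true, y_pred):
    """Fraction of cases whose predicted label differs from the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


def misclassification_table(y_true, y_pred, labels):
    """Counts of (true, predicted) pairs; rows = true class, columns = predicted."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_error(table):
    """Share of each true class (row) that falls off the diagonal."""
    rates = []
    for i, row in enumerate(table):
        total = sum(row)
        rates.append((total - row[i]) / total if total else float("nan"))
    return rates


# Toy labels, made up for illustration only (NOT pendigits results).
labels = [0, 1, 2]
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2, 2, 1]

mcr = misclassification_rate(y_true, y_pred)
table = misclassification_table(y_true, y_pred, labels)
errors = per_class_error(table)

# The class with the largest row-wise error rate is the "hardest" one.
hardest = labels[max(range(len(labels)), key=lambda i: errors[i])]

print("MCR:", mcr)
for label, row, err in zip(labels, table, errors):
    print(label, row, round(err, 3))
print("hardest class:", hardest)
```

Judging the hardest digit by row-wise error rate rather than by raw off-diagonal counts is a design choice: it avoids penalizing digits that simply appear more often in the test set.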