Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Feature selection is the task of selecting a small subset of the original features that achieves maximum classification accuracy. Such a subset offers several important benefits: it reduces the computational complexity of learning algorithms, saves time, improves accuracy, and the selected features can be insightful for people working in the problem domain. This makes feature selection an indispensable step in classification.
This dissertation presents a two-phase approach to feature selection. In the first phase, a filter method is applied using the correlation coefficient and mutual information as statistical measures of similarity. This phase improves classification performance by removing redundant and unimportant features.
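Purely as an illustration of such a filter phase, the sketch below scores features with NumPy and scikit-learn; the function name filter_phase, the keep_ratio threshold, and the way the two relevance scores are combined are assumptions of this example rather than the thesis's actual procedure (X is assumed to be a numeric feature matrix and y an encoded label vector).

    import numpy as np
    from sklearn.feature_selection import mutual_info_classif

    def filter_phase(X, y, keep_ratio=0.5):
        """Rank features by absolute Pearson correlation with the label and by
        mutual information, then keep the top-scoring fraction (illustrative)."""
        corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
        mi = mutual_info_classif(X, y)
        # Combine the two relevance scores; a simple normalized sum is an
        # illustrative choice, not the criterion used in the thesis.
        score = corr / (corr.max() + 1e-12) + mi / (mi.max() + 1e-12)
        n_keep = max(1, int(keep_ratio * X.shape[1]))
        selected = np.argsort(score)[::-1][:n_keep]
        return np.sort(selected)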
In the second phase, a wrapper method is applied with sequential forward selection and sequential backward elimination. This phase selects the relevant feature subset that yields maximum accuracy for the underlying classifier. A Support Vector Machine (SVM) classifier (linear and nonlinear) is used to evaluate the classification accuracy of the approach.
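A minimal sketch of the wrapper phase, assuming scikit-learn's SequentialFeatureSelector with a linear SVM as the evaluating classifier; the function name wrapper_phase, the fixed feature budget, and the 5-fold cross-validation are illustrative choices, not the thesis's exact setup.

    from sklearn.svm import SVC
    from sklearn.feature_selection import SequentialFeatureSelector
    from sklearn.model_selection import cross_val_score

    def wrapper_phase(X, y, n_features=10, direction="forward"):
        """Greedy sequential search (forward selection or backward elimination)
        that keeps the subset giving the best cross-validated SVM accuracy."""
        svm = SVC(kernel="linear")
        selector = SequentialFeatureSelector(
            svm, n_features_to_select=n_features, direction=direction, cv=5)
        selector.fit(X, y)
        mask = selector.get_support()
        acc = cross_val_score(svm, X[:, mask], y, cv=5).mean()
        return mask, acc

Passing direction="backward" gives sequential backward elimination with the same evaluation loop.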
Empirical results on commonly used data sets from the University of California, Irvine (UCI) repository and on microarray data sets show that the proposed method performs better in terms of classification accuracy, number of selected features, and computational efficiency.
2011, 104 p.