Improving Image Classification Performance using Joint Feature Selection
2014 (English). Doctoral thesis, monograph (Other academic)
In this thesis, we focus on the problem of image classification and investigate how its performance can be systematically improved. Improving the performance of different computer vision methods has been the subject of many studies. While different studies take different approaches to achieve this improvement, in this thesis we address this problem by investigating the relevance of the statistics collected from the image.
We propose a framework for gradually improving the quality of an already existing image descriptor. In our studies, we employ a descriptor which is composed of the responses of a series of discriminative components for summarizing each image. As we will show, this descriptor has an ideal form in which all categories become linearly separable. While reaching this form is not possible, we argue that, by replacing a small fraction of these components, it is possible to obtain a descriptor which is, on average, closer to this ideal form. To do so, we first identify which components do not contribute to the quality of the descriptor and replace them with more robust components. As we will show, this replacement has a positive effect on the quality of the descriptor.
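As an illustration only, the component-replacement idea can be sketched as follows. The abstract does not say how responses are pooled or how a component's contribution is measured, so this sketch assumes max-pooled linear responses and a Fisher-style ratio of between-class to within-class variance. The names `describe`, `contribution_scores`, `replace_weakest` and the `fraction` parameter are hypothetical and do not come from the thesis.

```python
import numpy as np

def describe(patches, components):
    """Descriptor of one image: one pooled response per component.

    patches:    (n_patches, d) local features of the image (hypothetical input)
    components: (k, d) weight vectors of the discriminative components
    Assumption: each response is the maximum linear response over all patches.
    """
    return (patches @ components.T).max(axis=0)

def contribution_scores(descriptors, labels):
    """Assumed quality proxy: between-class over within-class variance per component."""
    overall_mean = descriptors.mean(axis=0)
    between = np.zeros(descriptors.shape[1])
    within = np.zeros(descriptors.shape[1])
    for c in np.unique(labels):
        rows = descriptors[labels == c]
        between += len(rows) * (rows.mean(axis=0) - overall_mean) ** 2
        within += ((rows - rows.mean(axis=0)) ** 2).sum(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero

def replace_weakest(components, scores, candidates, fraction=0.1):
    """Swap the lowest-scoring fraction of components for candidate components."""
    n_replace = min(max(1, int(fraction * len(components))), len(candidates))
    weakest = np.argsort(scores)[:n_replace]  # indices of the least useful components
    updated = components.copy()
    updated[weakest] = candidates[:n_replace]
    return updated, weakest
```

In this sketch the candidate components would come from whatever procedure produces more robust components; the loop of scoring and replacing can be repeated to move the descriptor gradually.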
While there are many ways of obtaining more robust components, we introduce a joint feature selection problem to obtain image features that retain class-discriminative properties while simultaneously generalising across within-class variations. Our approach is based on the concept of a joint feature, where several small features are combined in a spatial structure. The proposed framework automatically learns the structure of the joint constellations in a class-dependent manner, improving the generalisation and discrimination capabilities of the local descriptor while still retaining a low-dimensional representation.
The joint feature selection problem discussed in this thesis belongs to a specific class of latent variable models that assumes each labeled sample is associated with a set of different features, with no prior knowledge of which feature is the most relevant one to use. Deformable Part Models (DPMs) are good examples of such models. These models are usually considered to be expensive to train and very sensitive to initialization. Here, we focus on the learning of such models by introducing a topological framework and show how it is possible to both reduce the learning complexity and produce more robust decision boundaries. We also argue that our framework can be used for producing robust decision boundaries without exploiting the dataset bias or relying on accurate annotations.
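For readers unfamiliar with this class of models, the following is a minimal sketch of the generic latent-variable setting described above: each labeled sample carries a set of candidate features, and the model scores the sample by its best candidate. It uses a plain alternating subgradient scheme on a hinge loss, which is a common baseline and is not the topological framework of the thesis. The function names and the `epochs`, `lr`, `reg` and `w0` parameters are assumptions made for illustration.

```python
import numpy as np

def latent_score(w, candidates):
    """Score a sample by its best candidate feature: max over z of w . x_z."""
    responses = candidates @ w
    best = int(np.argmax(responses))
    return float(responses[best]), best

def train_latent_linear(samples, labels, dim, epochs=20, lr=0.1, reg=0.01, w0=None):
    """Generic latent alternation (illustrative baseline, not the thesis method).

    samples: list of (n_candidates_i, dim) arrays, one per labeled sample
    labels:  sequence of +1 / -1
    w0:      optional starting weights; the outcome can depend on this choice
    """
    w = np.zeros(dim) if w0 is None else np.array(w0, dtype=float)
    for _ in range(epochs):
        for candidates, y in zip(samples, labels):
            score, best = latent_score(w, candidates)  # infer the latent choice
            grad = reg * w                             # L2 regularisation term
            if y * score < 1:                          # hinge margin is violated
                grad = grad - y * candidates[best]
            w = w - lr * grad
    return w
```

Because the latent choice is re-inferred from the current weights at every step, a poor starting point can lock the model onto the wrong candidates, which is one way to see the sensitivity to initialization mentioned above.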
To examine the hypothesis of this thesis, we evaluate different parts of our framework on several challenging datasets and demonstrate how our framework is capable of gradually improving the performance of image classification by collecting more robust statistics from the image and improving the quality of the descriptor.
Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2014. 135 p.
TRITA-CSC-A, ISSN 1653-5723 ; 2014:08
Image Classification, Latent Variable Models
Computer Vision and Robotics (Autonomous Systems)
Research subject Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-144896
ISBN: 978-91-7595-139-3
OAI: oai:DiVA.org:kth-144896
DiVA: diva2:715228
2014-05-21, F3, Lindstedtsv 26, KTH, Stockholm, 09:30 (English)
Kittler, Josef, Professor
Carlsson, Stefan, Professor
QC 20140506
2014-05-06; 2014-05-02; 2014-05-06
Bibliographically approved