Data Driven Visual Recognition
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. (Computer Vision Group)
2014 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis is mostly about supervised visual recognition problems. Based on a general definition of categories, the contents are divided into two parts: one which models categories and one which is not category based. We are interested in data driven solutions for both kinds of problems.

In the category-free part, we study novelty detection in temporal and spatial domains as a category-free recognition problem. Using data driven models, we demonstrate that based on a few reference exemplars, our methods are able to detect novelties in ego-motions of people, and changes in the static environments surrounding them.

In the category level part, we study object recognition. We consider both object category classification and localization, and propose scalable data driven approaches for both problems. A mixture of parametric classifiers, initialized with a sophisticated clustering of the training data, is demonstrated to adapt to the data better than various baselines such as the same model initialized with less subtly designed procedures. A nonparametric large margin classifier is introduced and demonstrated to have a multitude of advantages in comparison to its competitors: better training and testing time costs, the ability to make use of indefinite/invariant and deformable similarity measures, and adaptive complexity are the main features of the proposed model.
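
To make the mixture idea concrete, here is a minimal sketch, assuming toy feature arrays and scikit-learn; it is not the thesis implementation. The components are initialized from a k-means clustering of the positive examples and then refined by reassigning each positive to the component that scores it highest, a latent-variable style alternation.

# Minimal sketch (illustrative assumptions: k-means initialization, LinearSVC
# components, max-over-components mixture score); not the thesis's method.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def train_mixture(X_pos, X_neg, n_components=3, n_iters=5):
    # Initialize component membership of the positives by clustering them.
    assign = KMeans(n_clusters=n_components, n_init=10).fit_predict(X_pos)
    components = []
    for _ in range(n_iters):
        components = []
        for k in range(n_components):
            idx = assign == k
            if not idx.any():                 # guard against an emptied component
                idx = np.ones(len(X_pos), dtype=bool)
            Xk = np.vstack([X_pos[idx], X_neg])
            yk = np.hstack([np.ones(idx.sum()), -np.ones(len(X_neg))])
            components.append(LinearSVC(C=1.0).fit(Xk, yk))
        # Latent step: reassign each positive to its best-scoring component.
        scores = np.stack([c.decision_function(X_pos) for c in components])
        assign = scores.argmax(axis=0)
    return components

def mixture_score(components, X):
    # Mixture score: maximum over the component scores.
    return np.max(np.stack([c.decision_function(X) for c in components]), axis=0)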

We also propose a rather realistic model of recognition problems, which quantifies the interplay between representations, classifiers, and recognition performances. Based on data-describing measures which are aggregates of pairwise similarities of the training data, our model characterizes and describes the distributions of training exemplars. The measures are shown to capture many aspects of the difficulty of categorization problems and correlate significantly to the observed recognition performances. Utilizing these measures, the model predicts the performance of particular classifiers on distributions similar to the training data. These predictions, when compared to the test performance of the classifiers on the test sets, are reasonably accurate.
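
As a rough illustration of such data-describing measures (the exact definitions in the included manuscript may differ), the sketch below aggregates pairwise Gaussian similarities of a labeled training set into within-class and between-class statistics, which could then be regressed against observed accuracies across several datasets. The kernel choice, the particular aggregates, and the linear fit are assumptions made for the example.

# Illustrative sketch only; X is an (n, d) float array, y an integer label array.
import numpy as np
from sklearn.linear_model import LinearRegression

def dataset_measures(X, y, sigma=1.0):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    S = np.exp(-d2 / (2 * sigma ** 2))           # pairwise similarity matrix
    same = y[:, None] == y[None, :]
    off_diag = ~np.eye(len(y), dtype=bool)
    within = S[same & off_diag].mean()           # average similarity inside classes
    between = S[~same].mean()                    # average similarity across classes
    return np.array([within, between, within - between])

def fit_performance_model(measure_rows, accuracies):
    # Given measures and observed accuracies for several training sets, a simple
    # regression predicts performance on a similar distribution.
    return LinearRegression().fit(np.vstack(measure_rows), np.asarray(accuracies))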

We discuss various aspects of visual recognition problems: what is the interplay between representations and classification tasks, how can different models better adapt to the training data, etc. We describe and analyze the aforementioned methods that are designed to tackle different visual recognition problems, but share one common characteristic: being data driven.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2014, p. ix, 36
Keywords [en]
Visual Recognition, Data Driven, Supervised Learning, Mixture Models, Non-Parametric Models, Category Recognition, Novelty Detection
HSV category
Identifiers
URN: urn:nbn:se:kth:diva-145865; ISBN: 978-91-7595-197-3 (print); OAI: oai:DiVA.org:kth-145865; DiVA id: diva2:720768
Public defence
2014-06-12, F3, Lindstedtsvägen 26, KTH, Stockholm, 14:00 (English)
Opponent
Supervisor
Note

QC 20140604

Available from: 2014-06-04. Created: 2014-06-02. Last updated: 2018-01-11. Bibliographically approved.
List of papers
1. Novelty Detection from an Ego-Centric perspective
2011 (English). In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2011, p. 3297-3304. Conference paper, published paper (Refereed)
Abstract [en]

This paper demonstrates a system for the automatic extraction of novelty in images captured from a small video camera attached to a subject's chest, replicating the subject's visual perspective, while performing activities that are repeated daily. Novelty is detected when a (sub)sequence cannot be registered to previously stored sequences captured while performing the same daily activity. Sequence registration is performed by measuring appearance and geometric similarity of individual frames and exploiting the invariant temporal order of the activity. Experimental results demonstrate that this is a robust way to detect novelties induced by variations in the wearer's ego-motion, such as stopping and talking to a person. This is an essentially new and generic way of automatically extracting information of interest to the camera wearer and can be used as input to a system for life logging or memory support.
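
A minimal sketch of the registration idea, not the paper's system: align a query frame sequence to a stored reference sequence of the same activity with an order-preserving dynamic-programming pass over per-frame similarities, and flag query frames that register poorly as novel. The cosine similarity on per-frame feature vectors and the fixed threshold are illustrative assumptions.

import numpy as np

def cosine_sim(a, b):
    # Stand-in appearance similarity on per-frame feature vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def detect_novelty(query_frames, reference_frames, threshold=0.7):
    S = np.array([[cosine_sim(q, r) for r in reference_frames] for q in query_frames])
    nq, nr = S.shape
    D = np.full((nq, nr), -np.inf)
    D[0] = S[0]
    for i in range(1, nq):
        # Enforce the invariant temporal order: the matched reference index
        # can only stay the same or move forward.
        D[i] = S[i] + np.maximum.accumulate(D[i - 1])
    matched = D.argmax(axis=1)                     # best reference frame per query frame
    return S[np.arange(nq), matched] < threshold   # True where registration fails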

Series
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, ISSN 1063-6919
HSV category
Identifiers
urn:nbn:se:kth:diva-38873 (URN); 10.1109/CVPR.2011.5995731 (DOI); 000295615803073; 2-s2.0-80052890189 (Scopus ID); 978-145770394-2 (ISBN)
Conference
2011 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2011; Colorado Springs, CO; 20 June 2011 through 25 June 2011
Projects
VINST
Research funder
ICT - The Next Generation
Note
QC 20111012. Available from: 2011-09-01. Created: 2011-09-01. Last updated: 2014-06-04. Bibliographically approved.
2. Multi view registration for novelty/background separation
2012 (English). In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, IEEE Computer Society, 2012, p. 757-764. Conference paper, published paper (Refereed)
Abstract [en]

We propose a system for the automatic segmentation of novelties from the background in scenarios where multiple images of the same environment are available, e.g. obtained by wearable visual cameras. Our method finds the pixels in a query image corresponding to the underlying background environment by comparing it to reference images of the same scene. This is achieved despite the fact that all the images may have different viewpoints, significantly different illumination conditions, and contain different objects (cars, people, bicycles, etc.) occluding the background. We estimate the probability of each pixel in the query image belonging to the background by computing its appearance inconsistency to the multiple reference images. We then produce multiple segmentations of the query image using an iterated graph cuts algorithm, initializing from these estimated probabilities, and consecutively combine these segmentations to come up with a final segmentation of the background. Detection of the background in turn highlights the novel pixels. We demonstrate the effectiveness of our approach on a challenging outdoor data set.
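
A minimal sketch of the per-pixel consistency idea, assuming the reference images are already registered to the query view. The Gaussian colour consistency and the plain threshold are illustrative stand-ins; the paper instead feeds such probabilities into iterated graph cuts and combines the resulting segmentations.

import numpy as np

def background_probability(query, references, sigma=10.0):
    # query: HxWx3 float image; references: list of HxWx3 images aligned to it.
    diffs = np.stack([np.linalg.norm(query - ref, axis=-1) for ref in references])
    consistency = np.exp(-(diffs ** 2) / (2 * sigma ** 2))   # shape (N, H, W)
    # A pixel is likely background if it is consistent with its best-matching
    # reference, which tolerates occluders present in only some references.
    return consistency.max(axis=0)

def novelty_mask(query, references, threshold=0.5):
    # Low background probability highlights the novel pixels.
    return background_probability(query, references) < threshold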

Place, publisher, year, edition, pages
IEEE Computer Society, 2012
Series
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, ISSN 1063-6919
Keywords
Automatic segmentations, Background environment, Data sets, Graph cut, Illumination conditions, Multi-view registration, Multiple image, Multiple reference images, Multiple segmentation, Query images, Reference image, Computer vision, Pixels, Image segmentation
HSV category
Identifiers
urn:nbn:se:kth:diva-105314 (URN); 10.1109/CVPR.2012.6247746 (DOI); 000309166200095; 2-s2.0-84866662308 (Scopus ID); 978-146731226-4 (ISBN)
Conference
2012 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2012, 16 June 2012 through 21 June 2012, Providence, RI
Research funder
ICT - The Next Generation
Note

QC 20121121

Available from: 2012-11-21. Created: 2012-11-20. Last updated: 2018-01-12. Bibliographically approved.
3. Mixture component identification and learning for visual recognition
2012 (English). In: Computer Vision – ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part VI, Springer, 2012, p. 115-128. Conference paper, published paper (Refereed)
Abstract [en]

The non-linear decision boundary between object and background classes - due to large intra-class variations - needs to be modelled by any classifier wishing to achieve good results. While a mixture of linear classifiers is capable of modelling this non-linearity, learning this mixture from weakly annotated data is non-trivial and is the paper's focus. Our approach is to identify the modes in the distribution of our positive examples by clustering, and to utilize this clustering in a latent SVM formulation to learn the mixture model. The clustering relies on a robust measure of visual similarity which suppresses uninformative clutter by using a novel representation based on the exemplar SVM. This subtle clustering of the data leads to learning better mixture models, as is demonstrated via extensive evaluations on Pascal VOC 2007. The final classifier, using a HOG representation of the global image patch, achieves performance comparable to the state-of-the-art while being more efficient at detection time.
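
A minimal sketch of the clustering step, under the assumption that each positive example is represented by the responses of an exemplar classifier (one positive against all negatives) on the remaining positives, and that clustering happens in that score space rather than in raw feature space. The LinearSVC settings and the k-means step are illustrative, not the paper's exact recipe; the resulting cluster labels would then seed the mixture components, for example via the latent refinement sketched after the thesis abstract above.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def exemplar_score_representation(X_pos, X_neg):
    reps = []
    for x in X_pos:
        # Exemplar classifier: a single positive versus all negatives, with the
        # positive up-weighted so it is not ignored.
        X = np.vstack([x[None, :], X_neg])
        y = np.array([1] + [-1] * len(X_neg))
        clf = LinearSVC(C=1.0, class_weight={1: len(X_neg), -1: 1}).fit(X, y)
        reps.append(clf.decision_function(X_pos))    # responses on all positives
    return np.vstack(reps)

def cluster_positives(X_pos, X_neg, n_clusters=3):
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(
        exemplar_score_representation(X_pos, X_neg))

Clustering on classifier responses rather than raw descriptors is what suppresses uninformative clutter in this sketch: two positives end up in the same cluster only if their exemplar classifiers agree on which other positives look similar.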

Place, publisher, year, edition, pages
Springer, 2012
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 7577
Keywords
Decision boundary, Detection time, Image patches, Intra-class variation, Linear classifiers, Mixture components, Mixture model, Non-Linearity, Non-trivial, Positive examples, Visual recognition, Visual similarity, Weakly annotated data
HSV category
Identifiers
urn:nbn:se:kth:diva-106987 (URN); 10.1007/978-3-642-33783-3_9 (DOI); 000342828800009; 2-s2.0-84867892975 (Scopus ID); 978-364233782-6 (ISBN)
Conference
12th European Conference on Computer Vision, ECCV 2012; Florence; 7 October 2012 through 13 October 2012
Research funder
ICT - The Next Generation
Note

QC 20121207

Available from: 2012-12-05. Created: 2012-12-05. Last updated: 2018-01-12. Bibliographically approved.
4. Properties of Datasets Predict the Performance of Classifiers
2013 (English). Manuscript (preprint) (Other academic)
HSV category
Identifiers
urn:nbn:se:kth:diva-145982 (URN)
Note

QS 2014

Available from: 2014-06-04. Created: 2014-06-04. Last updated: 2018-01-11. Bibliographically approved.
5. Large Scale, Large Margin Classification using Indefinite Similarity Measures
(English). Manuscript (preprint) (Other academic)
HSV category
Identifiers
urn:nbn:se:kth:diva-145979 (URN)
Note

QS 2014

Available from: 2014-06-04. Created: 2014-06-04. Last updated: 2018-01-11. Bibliographically approved.

Open Access in DiVA

Thesis (6092 kB), 419 downloads
File information
File: FULLTEXT02.pdf; File size: 6092 kB; Checksum: SHA-512
506e40a42331d74209897ac0ee69c0d51a51ed61bf4d272b5535c738294aa62cb062aa885ec109a62a2e5225d738ba8bbebe26d0751f48495a3e202953717b75
Type: fulltext; Mimetype: application/pdf

Other links

http://www.csc.kth.se/~omida/PhD_thesis.pdf

Search in DiVA

By author/editor
Aghazadeh, Omid