Over the past two centuries the industrial revolution automated a great part of work that involved human muscles. Recently, since the beginning of the 21st century, the focus has shifted towards automating work that is involving our brain to further improve our lives. This is accomplished by establishing human-level intelligence through machines, which lead to the growth of the field of artificial intelligence. Machine learning is a core component of artificial intelligence. While artificial intelligence focuses on constructing an entire intelligence system, machine learning focuses on the learning ability and the ability to further use the learned knowledge for different tasks. This thesis targets the field of machine learning, especially structured representation learning, which is key for various machine learning approaches.
Humans sense the environment, extract information and make action decisions based on abstracted information. Similarly, machines receive data, abstract information from data through models and make decisions about the unknown through inference. Thus, models provide a mechanism for machines to abstract information. This commonly involves learning useful representations which are desirably compact, interpretable and useful for different tasks. In this thesis, the contribution relates to the design of efficient representation models with latent variables. To make the models useful, efficient inference algorithms are derived to fit the models to data. We apply our models to various applications from different domains, namely E-health, robotics, text mining, computer vision and recommendation systems.
The main contribution of this thesis relates to advancing latent variable models and deriving associated inference schemes for representation learning. This is pursued in three different directions. Firstly, through supervised models, where better representations can be learned knowing the tasks, corresponding to situated knowledge of humans. Secondly, through structured representation models, with which different structures, such as factorized ones, are used for latent variable models to form more efficient representations. Finally, through non-parametric models, where the representation is determined completely by the data. Specifically, we propose several new models combining supervised learning and factorized representation as well as a further model combining non-parametric modeling and supervised approaches. Evaluations show that these new models provide generally more efficient representations and a higher degree of interpretability.
Moreover, this thesis contributes by applying these proposed models in different practical scenarios, demonstrating that these models can provide efficient latent representations. Experimental results show that our models improve the performance for classical tasks, such as image classification and annotations, robotic scene and action understanding. Most notably, one of our models is applied to a novel problem in E-health, namely diagnostic prediction using discomfort drawings. Experimental investigation show here that our model can achieve significant results in automatic diagnosing and provides profound understanding of typical symptoms. This motivates novel decision support systems for healthcare personnel.
Stockholm: KTH Royal Institute of Technology, 2016. , 245 p.