Bayesian Modeling of Conditional Densities
Stockholm University, Faculty of Social Sciences, Department of Statistics.
2013 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis develops models and associated Bayesian inference methods for flexible univariate and multivariate conditional density estimation. The models are flexible in the sense that they can capture widely differing shapes of the data. The estimation methods are specifically designed to achieve flexibility while still avoiding overfitting. The models are flexible both for a given covariate value and across covariate space. A key contribution of this thesis is that it provides general approaches to density estimation with highly efficient Markov chain Monte Carlo methods. The methods are illustrated on several challenging non-linear and non-normal datasets.

In the first paper, a general model is proposed for flexibly estimating the density of a continuous response variable conditional on a possibly high-dimensional set of covariates. The model is a finite mixture of asymmetric Student-t densities with covariate-dependent mixture weights. The four parameters of the components, the mean, degrees of freedom, scale and skewness, are all modeled as functions of the covariates. The second paper explores how well a smooth mixture of symmetric components can capture skewed data. Simulations and applications on real data show that including covariate-dependent skewness in the components can lead to substantially improved performance on skewed data, often using a much smaller number of components. We also introduce smooth mixtures of gamma and log-normal components to model positively-valued response variables. In the third paper we propose a multivariate Gaussian surface regression model that combines both additive splines and interactive splines, and a highly efficient MCMC algorithm that updates all the multi-dimensional knot locations jointly. We use shrinkage priors to avoid overfitting, with different estimated shrinkage factors for the additive and surface parts of the model, and also different shrinkage parameters for the different response variables. In the last paper we present a general Bayesian approach for directly modeling dependencies between variables as functions of explanatory variables in a flexible copula context. In particular, the Joe-Clayton copula is extended to have covariate-dependent tail dependence and correlations. Posterior inference is carried out using a novel and efficient simulation method. The appendix of the thesis documents the computational implementation details.

Place, publisher, year, edition, pages
Stockholm: Department of Statistics, Stockholm University, 2013. 13 p.
Keyword [en]
Bayesian inference, Density estimation, smooth mixtures, surface regression, copulas, Markov chain Monte Carlo
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
URN: urn:nbn:se:su:diva-89426
ISBN: 978-91-7447-665-1 (print)
OAI: oai:DiVA.org:su-89426
DiVA: diva2:618973
Public defence
2013-06-10, William-Olssonsalen, Geovetenskapens hus, Svante Arrhenius väg 14, Stockholm, 10:00 (English)
Note

At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 3: In press. Paper 4: Manuscript.

Available from: 2013-05-16 Created: 2013-04-24 Last updated: 2013-05-06 Bibliographically approved
List of papers
1. Flexible Modeling of Conditional Distributions using Smooth Mixtures of Asymmetric Student T Densities
2010 (English) In: Journal of Statistical Planning and Inference, ISSN 0378-3758, E-ISSN 1873-1171, Vol. 140, no 12, 3638-3654 p. Article in journal (Refereed) Published
Abstract [en]

A general model is proposed for flexibly estimating the density of a continuous response variable conditional on a possibly high-dimensional set of covariates. The model is a finite mixture of asymmetric student-t densities with covariate-dependent mixture weights. The four parameters of the components, the mean, degrees of freedom, scale and skewness, are all modeled as functions of the covariates. Inference is Bayesian and the computation is carried out using Markov chain Monte Carlo simulation. To enable model parsimony, a variable selection prior is used in each set of covariates and among the covariates in the mixing weights. The model is used to analyze the distribution of daily stock market returns, and shown to more accurately forecast the distribution of returns than other widely used models for financial data.
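
The model described in this abstract can be illustrated with a minimal sketch. The abstract states only that the components are asymmetric student-t densities whose four parameters and mixing weights depend on the covariates. Everything else below is an assumption made for illustration: the two-piece ("split") form of the asymmetric density, the log links for scale, degrees of freedom and skewness, the softmax form of the weights, and the names `split_t_pdf`, `smooth_mixture_pdf` and the `params` keys. The Bayesian inference and variable selection steps are not shown.

```python
import math
import numpy as np

def student_t_pdf(z, nu):
    """Standard Student-t density with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return np.exp(log_c - (nu + 1) / 2 * np.log1p(z ** 2 / nu))

def split_t_pdf(y, mu, phi, nu, lam):
    """Two-piece Student-t (assumed form): scale phi to the left of mu,
    scale lam * phi to the right. lam = 1 gives the symmetric case."""
    scale = np.where(y <= mu, phi, lam * phi)
    return 2.0 / (1.0 + lam) * student_t_pdf((y - mu) / scale, nu) / phi

def smooth_mixture_pdf(y, x, params):
    """Conditional density p(y | x) for a K-component smooth mixture.

    params maps hypothetical names to (K, p) coefficient arrays:
    'gamma' (mixing weights), 'beta_mu', 'beta_phi', 'beta_nu', 'beta_lam'.
    """
    logits = params["gamma"] @ x
    w = np.exp(logits - logits.max())
    w /= w.sum()                              # softmax mixing weights
    mu = params["beta_mu"] @ x                # identity link for the mean
    phi = np.exp(params["beta_phi"] @ x)      # log links keep these positive
    nu = np.exp(params["beta_nu"] @ x)
    lam = np.exp(params["beta_lam"] @ x)
    return sum(w[k] * split_t_pdf(y, mu[k], phi[k], nu[k], lam[k])
               for k in range(len(w)))

# Example: two components, intercept plus one covariate.
example_params = {
    "gamma": np.array([[0.2, -0.1], [0.0, 0.3]]),
    "beta_mu": np.array([[-1.0, 0.5], [1.0, -0.5]]),
    "beta_phi": np.array([[0.0, 0.1], [-0.2, 0.0]]),
    "beta_nu": np.array([[1.5, 0.0], [2.0, 0.0]]),
    "beta_lam": np.array([[0.3, 0.0], [-0.3, 0.0]]),
}
density = smooth_mixture_pdf(np.linspace(-5, 5, 11),
                             np.array([1.0, 0.5]), example_params)
```

Because every parameter is recomputed from `x`, the shape of the density (location, spread, tail weight and skewness) can change across covariate space, which is the sense of flexibility the abstract refers to.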

Keyword
Bayesian inference, Markov Chain Monte Carlo, Mixture of Experts, Variable selection, Volatility modeling
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-43073 (URN)
10.1016/j.jspi.2010.04.031 (DOI)
000281982700008 ()
Available from: 2010-11-10 Created: 2010-09-27 Last updated: 2017-12-12 Bibliographically approved
2. Modeling Conditional Densities using Finite Smooth Mixtures
2011 (English) In: Mixtures: Estimation and Applications / [ed] Kerrie L. Mengersen, Christian P. Robert, D. Michael Titterington, Chichester: John Wiley & Sons, 2011, 20 p., 123-144 p. Chapter in book (Refereed)
Abstract [en]

Smooth mixtures, i.e. mixture models with covariate-dependent mixing weights, are very useful flexible models for conditional densities. Previous work shows that using too simple mixture components for modeling heteroscedastic and/or heavy tailed data can give a poor fit, even with a large number of components. This paper explores how well a smooth mixture of symmetric components can capture skewed data. Simulations and applications on real data show that including covariate-dependent skewness in the components can lead to substantially improved performance on skewed data, often using a much smaller number of components. Furthermore, variable selection is effective in removing unnecessary covariates in the skewness, which means that there is little loss in allowing for skewness in the components when the data are actually symmetric. We also introduce smooth mixtures of gamma and log-normal components to model positively-valued response variables.

Place, publisher, year, edition, pages
Chichester: John Wiley & Sons, 2011. 20 p.
Series
Wiley series in probability and statistics
Keyword
Bayesian inference, Markov chain Monte Carlo, Mixture of Experts, Variable selection
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-43078 (URN)
10.1002/9781119995678.ch6 (DOI)
978-1-119-99389-6 (ISBN)
Available from: 2010-11-10 Created: 2010-09-27 Last updated: 2013-05-02 Bibliographically approved
3. Efficient Bayesian Multivariate Surface Regression
2013 (English) In: Scandinavian Journal of Statistics, ISSN 0303-6898, E-ISSN 1467-9469, Vol. 40, no 4, 706-723 p. Article in journal (Refereed) Published
Abstract [en]

Methods for choosing a fixed set of knot locations in additive spline models are fairly well established in the statistical literature. While most of these methods are in principle directly extendable to non-additive surface models, they are likely to be less successful in that setting because of the curse of dimensionality, especially when there are more than a couple of covariates. We propose a regression model for a multivariate Gaussian response that combines both additive splines and interactive splines, and a highly efficient MCMC algorithm that updates all the multivariate knot locations jointly. We use shrinkage priors to avoid overfitting with different estimated shrinkage factors for the additive and surface part of the model, and also different shrinkage parameters for the different response variables. This makes it possible for the model to adapt to varying degrees of nonlinearity in different parts of the data in a parsimonious way. Simulated data and an application to firm leverage data show that the approach is computationally efficient, and that allowing for freely estimated knot locations can offer a substantial improvement in out-of-sample predictive performance.
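
The structure of such a model can be sketched as follows, under strong simplifications. The abstract does not name the basis functions, so the truncated-linear additive terms and the radial `r^2 log r` surface terms are assumptions, as are the function names. Knot locations and the two shrinkage factors `lam_add` and `lam_surf` are held fixed here and shared across responses, whereas the paper estimates knots and shrinkage by MCMC and lets shrinkage differ between response variables. The estimate returned is the conditional posterior mean under a Gaussian prior on the spline coefficients, which has a ridge-type closed form.

```python
import numpy as np

def additive_basis(X, knots_per_dim):
    """Truncated-linear terms (x_j - k)_+ for each covariate j and knot k."""
    cols = [np.maximum(X[:, j] - k, 0.0)
            for j, knots in enumerate(knots_per_dim) for k in knots]
    return np.column_stack(cols)

def surface_basis(X, surface_knots):
    """Radial terms r^2 log r around multi-dimensional knots (interactions)."""
    r = np.linalg.norm(X[:, None, :] - surface_knots[None, :, :], axis=2)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(r > 0, r ** 2 * np.log(r), 0.0)

def fit_shrunk_surface(X, Y, knots_per_dim, surface_knots, lam_add, lam_surf):
    """Posterior-mean coefficients with separate shrinkage per model part.

    X: (n, q) covariates, Y: (n, p) multivariate response.
    Intercept and linear terms are left unshrunk (flat prior).
    """
    n = X.shape[0]
    Z_lin = np.column_stack([np.ones(n), X])
    Z_add = additive_basis(X, knots_per_dim)
    Z_sur = surface_basis(X, surface_knots)
    Z = np.hstack([Z_lin, Z_add, Z_sur])
    penalty = np.concatenate([
        np.zeros(Z_lin.shape[1]),
        np.full(Z_add.shape[1], lam_add),    # shrinkage for the additive part
        np.full(Z_sur.shape[1], lam_surf),   # shrinkage for the surface part
    ])
    B = np.linalg.solve(Z.T @ Z + np.diag(penalty), Z.T @ Y)
    return B, Z
```

Larger values of `lam_add` or `lam_surf` pull the corresponding coefficients toward zero, so a part of the model that is not needed can be switched off without removing its basis functions.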

Keyword
Bayesian inference, free knots, Markov chain Monte Carlo, surface regression, splines
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-70927 (URN)
10.1111/sjos.12022 (DOI)
000327258100004 ()
Available from: 2012-01-25 Created: 2012-01-24 Last updated: 2017-12-08 Bibliographically approved
4. Modeling Covariate-Contingent Correlation and Tail-Dependence with Copulas
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Copula functions provide an approach to constructing multivariate densities with flexible combinations of distinct marginal distributions, and to measuring the degree of tail dependence and correlation between the marginal distributions via a novel strategy. Nevertheless, common approaches estimate tail dependence and correlations through nuisance parameters, which makes the final results neither tractable nor interpretable for practitioners. In this paper we address the problem by presenting a general Bayesian approach for directly modeling covariate-linked tail dependence and correlations. Posterior inference is carried out using a novel and efficient simulation method.
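
The thesis summary names the Joe-Clayton copula as the family extended in this paper. A minimal sketch of the idea of modeling tail dependence directly is given below: the copula is parameterized by its lower and upper tail-dependence coefficients, using the standard relations `lambda_L = 2^(-1/delta)` and `lambda_U = 2 - 2^(1/theta)`, and each coefficient is tied to covariates. The logit link and the names `tail_dependence` and `joe_clayton_cdf` are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def tail_dependence(x, beta):
    """Map covariates to a tail-dependence coefficient in (0, 1).
    The logit link is an assumption made for this sketch."""
    return 1.0 / (1.0 + np.exp(-(x @ beta)))

def joe_clayton_cdf(u, v, lambda_l, lambda_u):
    """Joe-Clayton (BB7) copula CDF, parameterized by tail dependence.

    lambda_l, lambda_u in (0, 1) are the lower and upper tail-dependence
    coefficients; u and v must lie in (0, 1].
    """
    theta = np.log(2.0) / np.log(2.0 - lambda_u)   # theta > 1
    delta = -np.log(2.0) / np.log(lambda_l)        # delta > 0
    a = (1.0 - (1.0 - u) ** theta) ** (-delta)
    b = (1.0 - (1.0 - v) ** theta) ** (-delta)
    return 1.0 - (1.0 - (a + b - 1.0) ** (-1.0 / delta)) ** (1.0 / theta)

# Example: tail dependence that varies with one covariate plus an intercept.
x = np.array([1.0, 0.4])
lam_l = tail_dependence(x, np.array([-1.0, 0.5]))
lam_u = tail_dependence(x, np.array([0.2, -0.3]))
c = joe_clayton_cdf(0.3, 0.6, lam_l, lam_u)
```

With this parameterization the regression coefficients act on the quantities a practitioner cares about (tail dependence) rather than on `theta` and `delta`, which is the interpretability concern raised in the abstract.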

Keyword
Covariate-dependent copula, tail-dependence, Kendall's tau, MCMC, Bayesian variable selection
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-89506 (URN)
Available from: 2013-04-29 Created: 2013-04-29 Last updated: 2013-05-02 Bibliographically approved

Open Access in DiVA

fulltext (225 kB), 679 downloads
File information
File name: FULLTEXT01.pdf
File size: 225 kB
Checksum (SHA-512): ea58cb3056492c76503b628f0f7f4f003a0095f55c0360e4a8a4c303b907c48a116ff429fd3e320523ad6981940b680a5c5c283aed01a947987b8b71e919464a
Type: fulltext
Mimetype: application/pdf

Author
Li, Feng