DOLDA: a regularized supervised topic model for high-dimensional multi-class regression
Linköpings universitet, Institutionen för datavetenskap, Statistik och maskininlärning. Linköpings universitet, Filosofiska fakulteten. Aalto University, Espoo, Finland.
Ericsson AB, Stockholm, Sweden.
Linköpings universitet, Institutionen för datavetenskap, Statistik och maskininlärning. Linköpings universitet, Filosofiska fakulteten. Stockholm University, Stockholm, Sweden.
2019 (English). In: Computational statistics (Zeitschrift), ISSN 0943-4062, E-ISSN 1613-9658. Article in journal (Refereed). Epub ahead of print.
Abstract [en]

Generating user-interpretable multi-class predictions in data-rich environments with many classes and explanatory covariates is a daunting task. We introduce Diagonal Orthant Latent Dirichlet Allocation (DOLDA), a supervised topic model for multi-class classification that can handle many classes as well as many covariates. To handle many classes, we use the recently proposed Diagonal Orthant probit model (Johndrow et al., in: Proceedings of the sixteenth international conference on artificial intelligence and statistics, 2013) together with an efficient Horseshoe prior for variable selection/shrinkage (Carvalho et al. in Biometrika 97:465–480, 2010). We propose a computationally efficient parallel Gibbs sampler for the new model. An important advantage of DOLDA is that learned topics are directly connected to individual classes without the need for a reference class. We evaluate the model’s predictive accuracy and scalability, and demonstrate DOLDA’s advantage in interpreting the generated predictions.

Place, publisher, year, edition, pages
Springer, 2019.
Keywords [en]
Text classification, Latent Dirichlet Allocation, Horseshoe prior, Diagonal Orthant probit model, Interpretable models
Identifiers
URN: urn:nbn:se:liu:diva-159217
DOI: 10.1007/s00180-019-00891-1
Scopus ID: 2-s2.0-85067414496
OAI: oai:DiVA.org:liu-159217
DiVA, id: diva2:1340533
Available from: 2019-08-05. Created: 2019-08-05. Last updated: 2019-11-14. Bibliographically approved.

Open Access in DiVA

fulltext (1158 kB), 17 downloads
File information
File name: FULLTEXT01.pdf
File size: 1158 kB
Checksum (SHA-512): 84eb60b070b1b1cd1c2a263550882d2fae3129affcd5c0e8bac9c1c0e6119f16d74ab725fc8d4077ca49b4ef2314934029e0a20b08311ee301ca939dd56c7734
Type: fulltext
Mimetype: application/pdf
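
The SHA-512 checksum in the file information can be used to check that a downloaded copy of the full text is intact. The following is a minimal Python sketch, not part of the record itself. The local path FULLTEXT01.pdf and the helper names sha512_of_file and matches_record are assumptions chosen for illustration.

```python
import hashlib
from pathlib import Path

# Checksum copied from the file information in this record.
EXPECTED_SHA512 = (
    "84eb60b070b1b1cd1c2a263550882d2fae3129affcd5c0e8bac9c1c0e6119f16"
    "d74ab725fc8d4077ca49b4ef2314934029e0a20b08311ee301ca939dd56c7734"
)


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_SHA512):
    """Compare a local file's digest with the checksum from the record."""
    return sha512_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical location of the downloaded PDF; adjust as needed.
    local_copy = Path("FULLTEXT01.pdf")
    if local_copy.exists():
        print("checksum matches record:", matches_record(local_copy))
    else:
        print("no local copy found at", local_copy)
```

A mismatch would suggest an incomplete download or a different version of the file than the one described here.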

By author/editor
Magnusson, Måns; Villani, Mattias