Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning
2022 (English). In: Proceedings of the Northern Lights Deep Learning Workshop 2022 / [ed] Sigurd Løkse, Benjamin Ricaud, Septentrio Academic Publishing, 2022, Vol. 3. Conference paper, Published paper (Refereed)
Abstract [en]
Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English. This work investigates, through an empirical study, the potential for transfer learning of such models to the Swedish language. DialoGPT, an English-language pre-trained model, is adapted by training on three different Swedish-language conversational datasets obtained from publicly available sources: Reddit, Familjeliv and the GDC. Perplexity score (an automated intrinsic metric) and human evaluation surveys were used to assess the performance of the fine-tuned models. We also compare the DialoGPT experiments with an attention-mechanism-based seq2seq baseline model, trained on the GDC dataset. The results indicate that the capacity for transfer learning can be exploited with considerable success. Human evaluators asked to score the simulated dialogues judged over 57% of the chatbot responses to be human-like for the model trained on the largest (Swedish) dataset. The work agrees with the hypothesis that deep monolingual models learn some abstractions which generalize across languages. We contribute the code, datasets and model checkpoints and host the demos on the HuggingFace platform.
Place, publisher, year, edition, pages
Septentrio Academic Publishing, 2022. Vol. 3
Series
Proceedings of the Northern Lights Deep Learning Workshop, ISSN 2703-6928
Keywords [en]
Conversational Systems, Chatbots, Dialogue, DialoGPT, Swedish
National Category
Natural Language Processing, Computer Sciences
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-90163
DOI: 10.7557/18.6231
OAI: oai:DiVA.org:ltu-90163
DiVA, id: diva2:1651853
Conference
Northern Lights Deep Learning Conference (NLDL 2022), Tromsø, Norway, January 10-12, 2022
2022-04-13, 2022-04-13, 2025-02-01. Bibliographically approved