Digitala Vetenskapliga Arkivet

FedQAS: Privacy-Aware Machine Reading Comprehension with Federated Learning
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science. ORCID iD: 0000-0002-6309-2892
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science. ORCID iD: 0000-0003-0302-6276
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science. ORCID iD: 0000-0001-7273-7923
2022 (English). In: Applied Sciences, E-ISSN 2076-3417, Vol. 12, no 6, article id 3130. Article in journal (Refereed). Published
Abstract [en]

Machine reading comprehension (MRC) of text data is a challenging task in Natural Language Processing (NLP), with much ongoing research fueled by the release of the Stanford Question Answering Dataset (SQuAD) and the Conversational Question Answering (CoQA) dataset. It is an effort to teach computers to "understand" a text and then answer questions about it using deep learning. However, large-scale training on private text data and knowledge sharing have so far been missing for this NLP task. Hence, we present FedQAS, a privacy-preserving machine reading system capable of leveraging large-scale private data without the need to pool those datasets in a central location. The proposed approach combines transformer models and federated learning technologies. The system is developed using the FEDn framework and deployed as a proof-of-concept alliance initiative. FedQAS is flexible, language-agnostic, and allows intuitive participation and execution of local model training. In addition, we present the architecture and implementation of the system and provide a reference evaluation based on the SQuAD dataset, to showcase how it overcomes data privacy issues and enables knowledge sharing between alliance members in a federated learning setting.
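
The abstract does not say which aggregation rule FedQAS uses. As an illustration of how a federated setup can share knowledge without pooling data, the sketch below shows federated averaging, one common rule: each member trains locally and sends only model weights, which a coordinator averages in proportion to the number of local examples. The function name `federated_average`, the `Weights` type, and the toy parameter name `"qa_head"` are hypothetical and are not taken from FedQAS or the FEDn API.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its local example count.

    `updates` holds (weights, num_local_examples) pairs. Only weights are
    exchanged; the private training text never leaves the client.
    """
    if not updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total example count must be positive")

    first_weights, _ = updates[0]
    averaged: Weights = {}
    for name, values in first_weights.items():
        acc = [0.0] * len(values)
        for weights, n in updates:
            share = n / total  # this client's contribution to the average
            for i, v in enumerate(weights[name]):
                acc[i] += v * share
        averaged[name] = acc
    return averaged


# Toy round with two alliance members holding 1 and 3 local examples.
client_a = ({"qa_head": [1.0, 2.0]}, 1)
client_b = ({"qa_head": [3.0, 4.0]}, 3)
global_weights = federated_average([client_a, client_b])
```

In this toy round the second member contributes three quarters of the average because it holds three of the four examples. In a real deployment the weights would be those of a transformer model and the exchange would go through the federated learning framework.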

Place, publisher, year, edition, pages
MDPI, 2022. Vol. 12, no 6, article id 3130
Keywords [en]
machine reading comprehension, natural language processing, question answering, data privacy, federated learning, transformer
National Category
Language Technology (Computational Linguistics)
Research subject
Scientific Computing; Computational Linguistics
Identifiers
URN: urn:nbn:se:uu:diva-472746; DOI: 10.3390/app12063130; ISI: 000776187800001; OAI: oai:DiVA.org:uu-472746; DiVA, id: diva2:1652615
Projects
eSSENCE - An eScience Collaboration
Available from: 2022-04-19 Created: 2022-04-19 Last updated: 2023-01-12 Bibliographically approved

Authors
Ait-Mlouk, Addi; Alawadi, Sadi; Toor, Salman; Hellander, Andreas