Digitala Vetenskapliga Arkivet

Decentralized Word2Vec Using Gossip Learning
KTH, School of Electrical Engineering and Computer Science (EECS).
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0002-0223-8907
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0003-4516-7317
2021 (English). In: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021), 2021. Conference paper, Published paper (Refereed).
Abstract [en]

Advanced NLP models require huge amounts of data from various domains to produce high-quality representations. It is therefore useful for a few large public and private organizations to pool their corpora during training. However, factors such as legislation and users' emphasis on data privacy may prevent centralized orchestration and data sharing among these organizations. For this specific scenario, we therefore investigate how gossip learning, a massively parallel, data-private, decentralized protocol, compares to a shared-dataset solution. We find that the application of Word2Vec in a gossip learning framework is viable. Without any tuning, the results are comparable to a traditional centralized setting, with a reduction in ground-truth similarity scores as low as 4.3%. Furthermore, the results are up to 54.8% better than independent local training.
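The gossip learning pattern referenced in the abstract can be summarized as peers that train only on their own private data and periodically push their model parameters to a randomly chosen neighbour, which merges them into its local model. The sketch below illustrates this pattern for Word2Vec-style embeddings; it is not the paper's implementation, and the toy corpus, the merge-by-averaging rule, and all names and hyperparameters are illustrative assumptions.

    # Minimal sketch of gossip learning for Word2Vec-style embeddings.
    # Hypothetical toy example; not the paper's actual code or hyperparameters.
    import numpy as np

    rng = np.random.default_rng(0)
    VOCAB, DIM, PEERS, ROUNDS = 50, 16, 4, 30

    def local_corpus(peer_id, n_pairs=200):
        """Toy private data: (center, context) word-id pairs unique to each peer."""
        centers = rng.integers(0, VOCAB, n_pairs)
        contexts = (centers + peer_id + 1) % VOCAB  # synthetic co-occurrence pattern
        return list(zip(centers, contexts))

    def sgns_step(W_in, W_out, pairs, lr=0.05, negatives=5):
        """One pass of skip-gram with negative sampling over the peer's local pairs."""
        for center, context in pairs:
            samples = [(context, 1.0)] + [(n, 0.0) for n in rng.integers(0, VOCAB, negatives)]
            for word, label in samples:
                score = 1.0 / (1.0 + np.exp(-np.clip(W_in[center] @ W_out[word], -30, 30)))
                grad = score - label
                g_in = grad * W_out[word].copy()          # keep pre-update output vector
                W_out[word] -= lr * grad * W_in[center]
                W_in[center] -= lr * g_in

    # Each peer holds its own model copy and its own private corpus.
    peers = [{"W_in": rng.normal(0, 0.1, (VOCAB, DIM)),
              "W_out": rng.normal(0, 0.1, (VOCAB, DIM)),
              "data": local_corpus(p)} for p in range(PEERS)]

    for _ in range(ROUNDS):
        for peer in peers:
            # 1) Train on local, private data only -- raw text never leaves the peer.
            sgns_step(peer["W_in"], peer["W_out"], peer["data"])
            # 2) Gossip: push the current parameters to one random peer
            #    (this toy version may occasionally pick the sender itself).
            neighbour = peers[rng.integers(0, PEERS)]
            # 3) The receiver merges the incoming model with its own by averaging.
            neighbour["W_in"] = (neighbour["W_in"] + peer["W_in"]) / 2
            neighbour["W_out"] = (neighbour["W_out"] + peer["W_out"]) / 2

    print("Peer 0 embedding of word 0 (first 4 dims):", peers[0]["W_in"][0, :4])

In this sketch only model parameters travel between peers, never the underlying text, which is the data-privacy property the abstract refers to; the averaging merge is one common choice in gossip learning, not necessarily the one evaluated in the paper.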

Place, publisher, year, edition, pages
2021.
National Category
Computer Sciences
Research subject
Computer Science; Information and Communication Technology
Identifiers
URN: urn:nbn:se:kth:diva-292658
OAI: oai:DiVA.org:kth-292658
DiVA, id: diva2:1543377
Conference
23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021)
Funder
EU, Horizon 2020, 813162
Note

QC 20210423

Available from: 2021-04-11. Created: 2021-04-11. Last updated: 2022-06-25. Bibliographically approved.

Open Access in DiVA

fulltext (2926 kB), 142 downloads
File information
File name: FULLTEXT01.pdf
File size: 2926 kB
Checksum (SHA-512): 12bed87fee4a84b23b56a385cd3867190355f63ce838a1cf01dc7952713c4b23f067fe9abaa5f0d19dc78944fc2ebe04a4db1dab21070804f2b53020d6b6fcbf
Type: fulltext
Mimetype: application/pdf


Search in DiVA

By author/editor
Alkathiri, Abdul Aziz; Giaretta, Lodovico; Girdzijauskas, Sarunas
By organisation
School of Electrical Engineering and Computer Science (EECS); Software and Computer systems, SCS
Computer Sciences

