Digitala Vetenskapliga Arkivet

Contribution Prediction in Federated Learning via Client Behavior Evaluation
Blekinge Tekniska Högskola, Faculty of Computing, Department of Computer Science. (AIDA) ORCID iD: 0000-0001-6061-0861
Blekinge Tekniska Högskola, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0003-3128-191x
Blekinge Tekniska Högskola, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0002-3118-5058
2025 (English). In: Future Generation Computer Systems, ISSN 0167-739X, E-ISSN 1872-7115, Vol. 166, article id 107639. Article in journal (Refereed). Published
Abstract [en]

Federated learning (FL), a decentralized machine learning framework that allows edge devices (i.e., clients) to train a global model while preserving data and client privacy, has recently become increasingly popular. In FL, a shared global model is built by aggregating the updated parameters in a distributed manner. To incentivize data owners to participate in FL, service providers must fairly evaluate the contribution of each data owner to the shared model during the learning process. To the best of our knowledge, most existing solutions are resource-demanding and usually run as an additional evaluation procedure, which incurs a high computational cost for large data owners. In this paper, we present simple and effective FL solutions that show how clients' behavior can be evaluated with respect to reliability during the training process. This is demonstrated for two existing FL models, Cluster Analysis-based Federated Learning (CA-FL) and Group-Personalized FL (GP-FL). In CA-FL, we assess how frequently each client is selected as a cluster representative and is thereby involved in building the shared model; this frequency can be considered a measure of the reliability of the client's data. In GP-FL, we count how many times each client changes the cluster it belongs to during FL training; frequent changes can be interpreted as unstable behavior, i.e., the client can be considered not very reliable. We validate our FL approaches on three LEAF datasets and benchmark their performance against two baseline contribution evaluation approaches. The experimental results demonstrate that applying the two FL models yields robust evaluations of clients' behavior during the training process. These evaluations can be used for further studying, comparing, understanding, and eventually predicting clients' contributions to the shared global model.
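
The two behavior measures described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the input layout (a list of representative clients per round, and a client-to-cluster mapping per round), and the normalization by the number of rounds are assumptions made here for illustration only. The sketch also assumes that cluster labels are comparable from one round to the next.

```python
from collections import Counter
from typing import Dict, Sequence


def representative_frequency(
    representatives_per_round: Sequence[Sequence[str]],
    clients: Sequence[str],
) -> Dict[str, float]:
    """CA-FL-style measure (assumed form): fraction of rounds in which
    each client was selected as a cluster representative."""
    rounds = len(representatives_per_round)
    counts: Counter = Counter()
    for reps in representatives_per_round:
        # Count a client at most once per round.
        counts.update(set(reps))
    return {c: (counts[c] / rounds if rounds else 0.0) for c in clients}


def cluster_change_counts(
    assignments_per_round: Sequence[Dict[str, int]],
) -> Dict[str, int]:
    """GP-FL-style measure (assumed form): number of times each client's
    cluster label differs from its label in the previous round."""
    changes: Counter = Counter()
    for prev, curr in zip(assignments_per_round, assignments_per_round[1:]):
        for client, label in curr.items():
            if client in prev and prev[client] != label:
                changes[client] += 1
    # Report every client seen in any round, including those with 0 changes.
    seen = set().union(*assignments_per_round) if assignments_per_round else set()
    return {c: changes[c] for c in seen}


if __name__ == "__main__":
    # Hypothetical three-round history for four clients.
    reps = [["a", "b"], ["a", "c"], ["a", "b"]]
    print(representative_frequency(reps, ["a", "b", "c", "d"]))

    assignments = [
        {"a": 0, "b": 0, "c": 1},
        {"a": 0, "b": 1, "c": 1},
        {"a": 0, "b": 0, "c": 1},
    ]
    print(cluster_change_counts(assignments))
```

Under these assumptions, a higher representative frequency suggests more reliable client data, while a higher change count suggests less stable client behavior. Both quantities are simple counters that could be updated during training, which matches the abstract's point that no separate evaluation procedure is needed.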

Place, publisher, year, edition, pages
Elsevier, 2025. Vol. 166, article id 107639
Keywords [en]
Behavior monitoring, Clustering analysis, Contribution evaluation, Eccentricity analysis, Federated learning
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-26080
DOI: 10.1016/j.future.2024.107639
ISI: 001407806400001
Scopus ID: 2-s2.0-85211047272
OAI: oai:DiVA.org:bth-26080
DiVA, id: diva2:1849017
Part of project
HINTS – Intelligenta verkligheter med människan i centrum
SERT – Software Engineering ReThought, KK-stiftelsen
Funder
KK-stiftelsen, 20220068
KK-stiftelsen, 20180010
Available from: 2024-04-05 Created: 2024-04-05 Last updated: 2025-09-30. Bibliographically approved
In thesis
1. Resource-Aware and Personalized Federated Learning via Clustering Analysis
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Today's advances in Artificial Intelligence (AI) enable training Machine Learning (ML) models on the data produced daily by connected edge devices. To make the most of the data stored on the devices, conventional ML approaches require gathering all individual data sets and transferring them to a central location to train a common model. However, centralizing data incurs significant costs related to communication, network resource utilization, and high traffic volume, and it raises privacy issues. To address these challenges, Federated Learning (FL) is employed as a novel approach to train a shared model on decentralized edge devices while preserving privacy. Despite its significant potential, FL still requires considerable resources such as time, computational power, energy, and bandwidth. More importantly, the computational capabilities of the training devices may vary over time. Furthermore, the devices involved in FL training may have distinct training datasets that differ in size and distribution. As a result, the convergence of FL models may become unstable and slow. These differences can influence the FL process and ultimately lead to suboptimal model performance within a heterogeneous federated network.

In this thesis, we have tackled several of the aforementioned challenges. Initially, an FL algorithm is proposed that utilizes cluster analysis to address the problem of communication overhead. This issue poses a major bottleneck in FL, particularly for complex models, large-scale applications, and frequent updates. The next study in this thesis extended the previous one to wireless sensor networks (WSNs). In WSNs, achieving energy-efficient transmission is a significant challenge due to their limited resources. This motivated a comprehensive overview and classification of the latest advancements in context-aware edge-based AI models, with a specific emphasis on sensor networks. The review also investigated the associated challenges and motivations for adopting AI techniques, along with an evaluation of current research areas that need further investigation. To optimize the aggregation of the FL model and reduce communication costs, the initial study addressing communication overhead is extended to include an FL-based cluster optimization approach. Furthermore, to reduce the detrimental effect of data heterogeneity among edge devices on FL, a new study of group-personalized FL models has been conducted. Finally, taking inspiration from the previously mentioned FL models, techniques for assessing clients' contributions by monitoring and evaluating their behavior during training are proposed. In comparison with most existing contribution evaluation solutions, the proposed techniques do not require significant computational resources.

The FL algorithms presented in this thesis are assessed on a range of real-world datasets. The extensive experiments demonstrated that the proposed FL techniques are effective and robust. These techniques improve communication efficiency, resource utilization, model convergence speed, and aggregation efficiency, and also reduce data heterogeneity when compared to other state-of-the-art methods.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2024. p. 260
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 2024:04
Keywords
Federated Learning, Clustering Analysis, Eccentricity Analysis, Non-IID Data, Model Personalization
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:bth-26081 (URN)
978-91-7295-478-6 (ISBN)
Public defence
2024-05-17, C413A, Karlskrona, 10:00 (English)
Opponent
Supervisors
Available from: 2024-04-05 Created: 2024-04-05 Last updated: 2025-09-30. Bibliographically approved

Open Access in DiVA

fulltext (2566 kB), 134 downloads
File information
File name: FULLTEXT02.pdf. File size: 2566 kB. Checksum: SHA-512
c1f24563fbc795a359d117e2d4c5acda69dedec9ecb229eb3e82e6863bf91a199b6d892cd3a3d55e2d3f379f2774f5b03824f7aaed93c7ab1e796d1ba5a74c78
Type: fulltext. Mimetype: application/pdf

By author/editor
Al-Saedi, Ahmed Abbas Mohsin; Boeva, Veselka; Casalicchio, Emiliano