Digitala Vetenskapliga Arkivet

Securing Large Language Models Against Membership Inference Attacks
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
2024 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 80 credits / 120 HE credits. Student thesis.
Abstract [en]

Large Language Models (LLMs) have transformed natural language processing by producing coherent and contextually appropriate text. However, their broad adoption brings significant privacy and security issues, especially concerning the potential for sensitive or personally identifiable information to be inferred from the model's outputs. A notable risk in this context is posed by Membership Inference Attacks (MIAs).

This thesis investigates the privacy challenges associated with fine-tuning LLMs, focusing on how fine-tuned models might retain and reveal memorized information from their training data. The research aims to develop secure fine-tuning techniques to create robust language models that can mitigate the privacy risks of MIAs. One key approach examined is Differential Privacy (DP), which ensures that the inclusion or exclusion of a single data record minimally impacts the model's output, thereby safeguarding individual privacy.

Utilizing the GPT-2 model and the SPEC5G dataset, this thesis fine-tunes models for a question-answering application (chatbot). Through empirical evaluations and experiments on benchmark datasets, we evaluate the effectiveness of differential privacy in protecting against MIAs. The study addresses the balance between privacy protection and model performance, aiming to identify practical challenges and enhance DP implementation in large-scale language models.

The results demonstrate that while Differential Privacy can significantly reduce the risk of MIAs, it often comes at the cost of reduced model accuracy. However, by fine-tuning the privacy parameters and employing DP techniques, this thesis successfully strikes a balance, achieving substantial privacy protection with minimal impact on model performance. These findings contribute to the development of robust and privacy-preserving natural language processing systems, addressing the increasing concerns over data privacy in the deployment of LLMs.
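The abstract describes MIAs against fine-tuned models. A common baseline for such attacks (not necessarily the exact method used in the thesis) is the loss-threshold attack: examples the model saw during fine-tuning tend to have lower loss, so an attacker classifies low-loss examples as training members. The sketch below illustrates the idea on simulated per-example losses; the distributions and threshold are illustrative assumptions, not values from the thesis.

```python
import random

def loss_threshold_mia(member_losses, nonmember_losses, threshold):
    """Baseline membership inference: predict 'member' when loss < threshold.

    Returns the attack's overall accuracy over both groups.
    """
    # True positives: members correctly flagged by their low loss.
    tp = sum(loss < threshold for loss in member_losses)
    # True negatives: non-members correctly left unflagged.
    tn = sum(loss >= threshold for loss in nonmember_losses)
    total = len(member_losses) + len(nonmember_losses)
    return (tp + tn) / total

# Simulated losses: a model that memorizes its fine-tuning data assigns
# noticeably lower loss to training members (hypothetical distributions).
random.seed(0)
member_losses = [random.gauss(1.0, 0.3) for _ in range(1000)]
nonmember_losses = [random.gauss(2.0, 0.5) for _ in range(1000)]

acc = loss_threshold_mia(member_losses, nonmember_losses, threshold=1.5)
```

An attack accuracy well above 0.5 (random guessing) indicates leakage; DP fine-tuning narrows the gap between the two loss distributions, pushing this accuracy back toward 0.5.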

Place, publisher, year, edition, pages
2024, p. 58
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:uu:diva-537947
OAI: oai:DiVA.org:uu-537947
DiVA, id: diva2:1895805
External cooperation
Ericsson R&D Security
Educational program
Master's Programme in Data Science
Presentation
Uppsala (English)
Available from: 2024-09-09. Created: 2024-09-06. Last updated: 2024-09-09. Bibliographically approved.

Open Access in DiVA

Pennas_thesis (1414 kB), 201 downloads
File information
File name: FULLTEXT01.pdf
File size: 1414 kB
Checksum (SHA-512):
3920f6ca0d76e8b8264776a2f0fc095f9e44aa9b4c93340ed0f5a15ad4036e0316dd52659b285e31d54816604418cd4675ee625a85ff466aa2bbb7a1595e97f1
Type: fulltext
Mimetype: application/pdf

By organisation
Department of Information Technology
Computer and Information Sciences

Total: 201 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 368 hits