Exploring Inhibitor Attention and Model Compression for Sustainable Language Models
2025 (English) Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits
Student thesis
Abstract [en]
This thesis investigates the optimization of transformer-based language models through an alternative attention mechanism called inhibitor attention, combined with model compression techniques. The thesis implements an attention mechanism that replaces traditional dot-product attention with a Manhattan distance-based approach and ReLU activation, examining both theoretical efficiency gains and practical implementation challenges.

Through experiments combining this mechanism with knowledge distillation and quantization, we evaluated the effectiveness of these methods on DistilBERT models. Results from the GLUE benchmark suite show that the fine-tuned inhibitor model achieves competitive performance, scoring 74.5 compared to 77.0 for the traditional dot-product model. In the IMDB sentiment analysis task, the inhibitor DistilBERT maintained precision comparable to that of the standard DistilBERT (92.81% vs. 92.82%).

While theoretical analysis through gem5 simulations suggested potential energy savings with inhibitor attention, practical measurements on a CPU revealed contradictory results. The inhibitor model showed higher energy consumption (2011 J vs. 1176 J at sequence length 128) and lower throughput than traditional attention, highlighting the impact of current hardware optimizations on real-world performance. These findings demonstrate that, while inhibitor attention shows promise for developing more efficient transformer models, realizing its potential may require specialized hardware and optimized implementations.
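The abstract only states that dot-product attention is replaced by a Manhattan distance-based score with ReLU activation. The sketch below is one possible reading of that idea, contrasted with standard scaled dot-product attention; the function names, the gamma parameter, and the exact way ReLU suppresses distant values are illustrative assumptions, not the formulation used in the thesis.

```python
# Illustrative sketch only: one possible "Manhattan distance + ReLU" attention,
# shown next to standard scaled dot-product attention for comparison.
# The thesis's actual inhibitor formulation may differ.
import torch
import torch.nn.functional as F

def dot_product_attention(q, k, v):
    """Standard scaled dot-product attention (baseline)."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5      # (..., L_q, L_k)
    return F.softmax(scores, dim=-1) @ v

def inhibitor_style_attention(q, k, v, gamma=1.0):
    """Hypothetical inhibitor-style attention: Manhattan (L1) distance as a
    dissimilarity score, with ReLU-based suppression of values instead of a
    softmax-weighted sum. gamma is an assumed inhibition strength."""
    # Pairwise L1 distances: z[..., i, j] = sum_k |q[i, k] - k[j, k]|
    z = torch.cdist(q, k, p=1)                        # (..., L_q, L_k)
    # Each value is inhibited in proportion to its distance from the query,
    # clipped at zero by ReLU, then averaged over the keys.
    inhibited = F.relu(v.unsqueeze(-3) - gamma * z.unsqueeze(-1))
    return inhibited.mean(dim=-2)                     # (..., L_q, d_v)

# Tiny usage example with random tensors.
q = torch.randn(2, 8, 16)   # (batch, query length, head dim)
k = torch.randn(2, 8, 16)
v = torch.randn(2, 8, 16)
print(dot_product_attention(q, k, v).shape)           # torch.Size([2, 8, 16])
print(inhibitor_style_attention(q, k, v).shape)       # torch.Size([2, 8, 16])
```

The intuition the sketch tries to capture is that both the L1 distance and the ReLU can be computed with additions and comparisons only, avoiding the multiplications and exponentials of dot-product-plus-softmax attention, which is the motivation for the energy analysis in the abstract.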
Place, publisher, year, edition, pages
2025, p. 66
Keywords [en]
AI, Machine Learning, Knowledge Distillation
National Category
Computer Engineering
Identifiers
URN: urn:nbn:se:ltu:diva-112306
OAI: oai:DiVA.org:ltu-112306
DiVA, id: diva2:1950490
External cooperation
RISE (Research Institutes of Sweden)
Educational program
Computer Science and Engineering, master's level
Presentation
2024-11-20, A2527, Luleå, 14:00 (English)
Supervisors
Examiners
2025-04-08, Bibliographically approved