Context: Software engineering (SE) artifacts and documents, such as requirements specifications, user stories, test cases, and concepts of operations (ConOps), are typically written in natural language, which makes them difficult to analyze and process automatically. Natural Language Processing (NLP) is a viable solution for handling such artifacts.
Objective: To conduct a systematic literature review exploring the current use of NLP in SE artifacts and tasks, supplemented by a tertiary study focusing on the emerging role of Large Language Models (LLMs) in software engineering research.
Method: We searched digital libraries for relevant papers and applied inclusion and exclusion criteria to filter the primary studies. We then analyzed NLP techniques applied to SE documents and examined their usage in this context. Our research methodology followed Kitchenham and Charters’ guidelines. Additionally, we conducted a tertiary study to synthesize findings from existing systematic literature reviews and surveys specifically addressing LLMs in software engineering.
Results: We selected 60 primary studies to identify the most common methods for NLP pipelines, feature extraction, language models, and machine learning algorithms used in SE. We also assessed the purposes of these methods, their benefits for SE, their difficulty, and their contribution to SE advancement. The tertiary study revealed a rapid proliferation of LLM-focused research, with comprehensive reviews documenting exponential growth in publications and widespread adoption across diverse SE tasks.
Conclusion: Requirements are the most frequently addressed artifacts using NLP techniques, with preprocessing and part-of-speech (POS) tagging being widely used. There is a notable increase in the use of large language models for various SE tasks, such as requirements elicitation, source code generation, bug fixing, and software testing. The tertiary study confirms that LLMs represent a pivotal shift in the research landscape, warranting dedicated investigation to understand their transformative impact on NLP applications in software engineering.
Sociedade Brasileira de Computação, 2025. Vol. 13, no. 2
Keywords: Natural Language Processing, Software Engineering, Machine Learning, Literature Review