2026 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]
Generative Artificial Intelligence (GenAI) is being rapidly adopted in software engineering, introducing a paradigm shift toward human-AI co-creation. However, the non-deterministic, probabilistic, and often black-box nature of GenAI models presents challenges for traditional software quality assurance. Conventional verification and validation techniques are insufficient to handle outputs that are neither predictably correct nor incorrect, but rather stochastically plausible. This discrepancy creates an urgent need for practical processes, metrics, and new governance frameworks to evaluate and manage the quality of GenAI systems in industrial environments. This thesis examines how industrial organizations adopt GenAI, identify metrics, and evaluate system qualities in alignment with ISO quality standards. Case studies were employed to explore real-world adoption processes, identify context-specific industrial metrics, and uncover practical insights within organizations. A snowballing literature review was conducted to systematically identify, categorize, and synthesize academic metrics for evaluating the output of GenAI systems. Finally, a controlled experiment was designed to quantitatively test the efficiency (e.g., end-to-end generation time) and effectiveness (e.g., accuracy) of GenAI agent choices. The main contributions of this thesis are a synthesized actionable model and framework grounded in both industrial practice and quality standards. The first contribution is a four-stage adoption model, denoted the IMRM model (Innovate → considerations, Measure → metrics, Realize → values, Manage → improvements), that integrates early-stage risk assessment (e.g., legal, security, and licensing) and quality evaluation throughout GenAI adoption and usage. The second contribution presents a detailed framework that connects risks and metrics to concrete decision support, justifying the business value (e.g., quality gates) and technical trade-offs of GenAI solutions.
The third contribution provides a structured mapping of GenAI quality to ISO/IEC 25010, 25023, and 25059 characteristics, grounding practical evaluation needs in a standardized vocabulary. This thesis concludes that a structured quality evaluation process, which prioritizes risks and context, is a valuable approach for building the business confidence required to leverage GenAI for efficient and effective software engineering in industry.
Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2026. p. 232
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 2026:01
Keywords
Quality Evaluation, Metrics, Artificial Intelligence, AI, Generative AI, Empirical Software Engineering
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-28958 (URN)
Public defence
2026-01-29, J1630, Karlskrona, 11:48 (English)
Opponent
Supervisors
2025-12-08 · 2025-12-03 · 2025-12-11
Bibliographically approved