Digitala Vetenskapliga Arkivet

Code Review as a Communication Network
Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, Institutionen för programvaruteknik. ORCID iD: 0000-0001-8879-6450
2025 (English) Doctoral thesis, comprising articles (Other academic)
Abstract [en]

Background: Modern software systems are often too large and complex for an individual developer to fully oversee, making it difficult to understand the implications of changes. Therefore, most collaborative software projects rely on code review as a communication network to foster asynchronous discussions about changes before they are merged. Although prior qualitative studies have revealed that practitioners view code review as a communication network, no formal theory or empirical validation exists. Without formalization and confirmatory evidence, the theory remains uncertain, limiting its credibility, practical relevance, and future development.

Objective: In this thesis, our objective is to (1) formalize the theory of code review as a communication network, (2) empirically evaluate the theory across varied perspectives, contexts, and conditions by quantifying the capability of code review to diffuse information among its participants, (3) demonstrate its practical relevance by applying the theory to the domain of tax compliance in collaborative software engineering, and (4) examine how the role of code review as a communication network for collaborative software engineering may evolve in the future.

Methods: To formalize the theory of code review as a communication network, we developed and validated a simulation model that operationalizes its core propositions about information diffusion among participants. To empirically evaluate the theory, we employed two complementary research approaches. First, we used the simulation model to conduct in silico experiments with closed-source code review systems from Microsoft, Spotify, and Trivago, as well as open-source code review systems from Android, Visual Studio Code, and React, to estimate the upper bound of information diffusion in code review. Second, through an observational study, we quantified the diffusion of information in code review across social, organizational, and architectural boundaries at Spotify. To demonstrate the practical relevance of the theory, we analyzed the code review system of a multinational enterprise as a communication network to reveal the latent collaboration structure among developers across borders, which is taxable. To explore the future of code review as a communication network, we conducted a questionnaire survey with 92 practitioners to gather their expectations and discuss how these anticipated changes may reshape our understanding of code review.

Results: By formalizing the theory of code review as a communication network modelled as a time-varying hypergraph, we were able to empirically demonstrate that traditional time-agnostic models substantially overestimate information diffusion in code review. Throughout our empirical studies, we found substantial evidence supporting the theory of code review as a communication network: We confirmed that code review is capable of diffusing information quickly and widely among participants, even at a large scale. We also observed extensive information diffusion across social, organizational, and architectural boundaries at Spotify, corroborating our theory. However, we also found that information diffusion patterns in open-source code review systems differ significantly, suggesting that findings from open-source environments may not directly apply to closed-source contexts. By applying the theory of code review as a communication network in the domain of tax compliance, we were able to uncover significant and previously unrecognized tax risks associated with collaborative software engineering within multinational enterprises. While practitioners expect code review to remain a core practice in collaborative software engineering, we identify a potential risk that generative AI may undermine code review’s role as a human communication network.

Conclusion: Our work on understanding code review as a communication network not only contributes to theory-driven, empirical software engineering research but also lays the groundwork for practical applications, particularly in the context of tax compliance. Future research is needed to explore the evolving role of code review as a communication network.

Place, publisher, year, edition, pages
Karlskrona, Sweden: Blekinge Tekniska Högskola, 2025, p. 188
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 2025:10
Keywords [en]
code review, software engineering, tax compliance, collaborative software engineering, communication network
National subject category
Research program
Software Engineering
Identifiers
URN: urn:nbn:se:bth-28424; ISBN: 978-91-7295-508-0 (print); OAI: oai:DiVA.org:bth-28424; DiVA, id: diva2:1991183
Public defence
2025-09-23, J1630, Valhallavägen 1, Karlskrona, 14:00 (English)
Opponent
Supervisor
Part of project
SERT - Software Engineering ReThought, Knowledge Foundation
Available from: 2025-08-22 Created: 2025-08-22 Last updated: 2025-09-30 Bibliographically approved
List of papers
1. Only Time Will Tell: Modelling Information Diffusion in Code Review with Time-Varying Hypergraphs
2022 (English) In: ESEM '22: Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement / [ed] Madeiral F., Lassenius C., Lassenius C., Conte T., Mannisto T., Association for Computing Machinery (ACM), 2022, p. 195-204. Conference paper, Published paper (Refereed)
Abstract [en]

Background: Modern code review is expected to facilitate knowledge sharing: All relevant information, the collective expertise, and meta-information around the code change and its context become evident, transparent, and explicit in the corresponding code review discussion. The discussion participants can leverage this information in the following code reviews; the information diffuses through the communication network that emerges from code review. Traditional time-aggregated graphs fall short in rendering information diffusion as those models ignore the temporal order of the information exchange: Information can only be passed on if it is available in the first place.

Aim: This manuscript presents a novel model based on time-varying hypergraphs for rendering information diffusion that overcomes the inherent limitations of traditional, time-aggregated graph-based models. 

Method: In an in-silico experiment, we simulate information diffusion within the internal code review at Microsoft and show the empirical impact of time on a key characteristic of information diffusion: the number of reachable participants.

Results: Time-aggregation significantly overestimates the paths of information diffusion available in communication networks and, thus, is neither precise nor accurate for modelling and measuring the spread of information within communication networks that emerge from code review. 

Conclusion: Our model overcomes the inherent limitations of traditional, static or time-aggregated, graph-based communication models and sheds the first light on information diffusion through code review. We believe that our model can serve as a foundation for understanding, measuring, managing, and improving knowledge sharing in code review in particular and information diffusion in software engineering in general.
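
The contrast between time-respecting and time-aggregated reachability described in this abstract can be illustrated with a small sketch. It is not the authors' simulation model: it assumes that each code review is one hyperedge with a single, distinct timestamp and that information passes to every participant of a review as soon as one of them carries it. The toy data and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: one (timestamp, participants) tuple per code review.
REVIEWS = [
    (1, {"bob", "carol"}),
    (2, {"alice", "bob"}),
    (3, {"carol", "dave"}),
]


def reachable_time_respecting(reviews, source):
    """Participants reachable from `source` when reviews are replayed in time order."""
    informed = {source}
    for _, participants in sorted(reviews, key=lambda review: review[0]):
        # Information can only be passed on if a participant already has it.
        if informed & participants:
            informed |= participants
    return informed - {source}


def reachable_time_aggregated(reviews, source):
    """Participants reachable from `source` when the order of reviews is ignored."""
    neighbours = defaultdict(set)
    for _, participants in reviews:
        for person in participants:
            neighbours[person] |= participants
    seen, stack = {source}, [source]
    while stack:
        for other in neighbours[stack.pop()]:
            if other not in seen:
                seen.add(other)
                stack.append(other)
    return seen - {source}


print(sorted(reachable_time_respecting(REVIEWS, "alice")))  # ['bob']
print(sorted(reachable_time_aggregated(REVIEWS, "alice")))  # ['bob', 'carol', 'dave']
```

In this toy example the time-aggregated view reports three reachable participants, although only one can actually receive the information, because the review between bob and carol took place before bob learned anything from alice.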

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022
Series
International Symposium on Empirical Software Engineering and Measurement, ISSN 1949-3770, E-ISSN 1949-3789
Keywords
code review, collaboration, communication, communication network, developer networks, in-silico experiment, information diffusion, knowledge sharing, measurement, simulation, time-varying hypergraph, topology
National subject category
Research program
Software Engineering
Identifiers
urn:nbn:se:bth-23480 (URN); 10.1145/3544902.3546254 (DOI); 001139214400018 (); 2-s2.0-85139871479 (Scopus ID); 9781450394277 (ISBN)
Conference
16th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2022, Helsinki, 18 September through 23 September 2022
Funder
Knowledge Foundation, 20180010
Note

open access

Available from: 2022-08-05 Created: 2022-08-05 Last updated: 2025-09-30 Bibliographically approved
2. The upper bound of information diffusion in code review
2025 (English) In: Empirical Software Engineering, ISSN 1382-3256, E-ISSN 1573-7616, Vol. 30, no. 1, article id 2. Article in journal (Refereed), Published
Abstract [en]

Background

Code review, the discussion around a code change among humans, forms a communication network that enables its participants to exchange and spread information. Although reported by qualitative studies, our understanding of the capability of code review as a communication network is still limited.

Objective

In this article, we report on a first step towards understanding and evaluating the capability of code review as a communication network by quantifying how fast and how far information can spread through code review: the upper bound of information diffusion in code review.

Method

In an in-silico experiment, we simulate an artificial information diffusion within large (Microsoft), mid-sized (Spotify), and small code review systems (Trivago) modelled as communication networks. We then measure the minimal topological and temporal distances between the participants to quantify how far and how fast information can spread in code review.

Results

An average code review participant in the small and mid-sized code review systems can spread information to between 72 % and 85 % of all code review participants within four weeks independently of network size and tooling; for the large code review systems, we found an absolute boundary of about 11 000 reachable participants. On average (median), information can spread between two participants in code review in less than five hops and less than five days.

Conclusion

We found evidence that the communication network emerging from code review scales well and spreads information fast and broadly, corroborating the findings of prior qualitative work. The study lays the foundation for understanding and improving code review as a communication network.

Place, publisher, year, edition, pages
Springer, 2025
Keywords
Code review, Simulation, Information diffusion, Communication network
National subject category
Identifiers
urn:nbn:se:bth-27028 (URN); 10.1007/s10664-024-10442-y (DOI); 001335071300002 (); 2-s2.0-85206942985 (Scopus ID)
Funder
Knowledge Foundation, 20180010
Available from: 2024-10-30 Created: 2024-10-30 Last updated: 2025-09-30 Bibliographically approved
3. The Capability of Code Review as a Communication Network
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Background

As a core practice in software engineering, the nature of code review has been researched extensively: Prior exploratory studies theorized that code review, the discussion around a code change among humans, forms a communication network that enables its participants to exchange and spread information. Although prior exploratory studies lay a valuable foundation for understanding code review as a communication network, the missing confirmatory counterpart leaves the theory's validity uncertain, limiting its credibility, practical applicability, and potential for further advancement.

Objective

This study aims to (1) formalize the theory of code review as a communication network and (2) empirically test its validity by quantifying how widely and how quickly information can spread through code review.

Method

We replicate an in silico experiment simulating information diffusion—the spread of information among participants—under best-case conditions across three open-source (Android, Visual Studio Code, React) and three closed-source code review systems (Microsoft, Spotify, Trivago) each modelled as a communication network. By measuring the number of reachable participants and the minimal topological and temporal distances, we quantify how widely and how quickly information can spread through code review.

Results

We find that code review networks can, under best-case conditions, support both wide and fast information diffusion, even in large-scale systems such as Microsoft’s internal code review platform. This confirms core assumptions of the theory of code review as a communication network. However, this capability is not uniformly present across all systems. Notably, we observe substantial differences between open-source and closed-source settings: open-source projects tend to achieve faster diffusion, while closed-source systems enable information to reach a broader share of participants.

Keywords
code review, simulation, replication, theory, communication network
National subject category
Research program
Software Engineering
Identifiers
urn:nbn:se:bth-28566 (URN)
Available from: 2025-09-01 Created: 2025-09-01 Last updated: 2025-09-30 Bibliographically approved
4. Measuring Information Diffusion in Code Review at Spotify
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Background

Code review, a core practice in software engineering, has been widely studied as a collaborative process, with prior work suggesting it functions as a communication network. Despite its popularity, this theory has not been formalized and remains untested, limiting its practical and theoretical significance.

Objective

This study aims to (1) formalize the theory of code review as a communication network and (2) empirically test its validity by quantifying the extent of information diffusion (the spread of information) in code review across social, organizational, and software architectural boundaries.

Method

We conduct a large-scale empirical analysis of 220,733 code reviews by 2,246 developers at Spotify during 2019. We conceptualize information diffusion along three distinct boundaries: social (dissimilarity among review participants), organizational (involvement of developers across teams), and architectural (interconnections among the components under review).

Results

We find that over 99.6% of review pairs have completely distinct participant sets, indicating high diffusion across social boundaries. Approximately 18% of code reviews involve developers from multiple teams, evidencing nontrivial diffusion across organizational boundaries. Of the 5.82% of code reviews linked to others, 99.0% span distinct repositories, reflecting architectural diffusion.

Conclusion

The substantial diffusion of information across social, organizational, and architectural boundaries empirically supports the theory of code review as a communication network. These findings indicate that code review plays a role not only in quality assurance, but also in enabling communication and coordination in large-scale, distributed software projects. They further support its use as a measurable proxy for cross-border collaboration in the context of tax compliance, but also raise concerns about the impact of integrating LLMs on its communicative function.
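
The social and organizational measures reported above can be approximated with a short sketch. The abstract does not give the study's exact definitions, so this example assumes that social diffusion is the share of review pairs whose participant sets are disjoint and that organizational diffusion is the share of reviews whose participants belong to more than one team. All names and data are hypothetical, and the architectural measure is omitted.

```python
from itertools import combinations

# Hypothetical toy data: participants per code review and a team per developer.
REVIEWS = {
    "r1": {"alice", "bob"},
    "r2": {"carol", "dave"},
    "r3": {"alice", "erin"},
}
TEAM = {"alice": "A", "bob": "A", "carol": "B", "dave": "B", "erin": "B"}


def share_disjoint_pairs(reviews):
    """Share of review pairs that have no participant in common."""
    pairs = list(combinations(reviews.values(), 2))
    disjoint = sum(1 for first, second in pairs if first.isdisjoint(second))
    return disjoint / len(pairs)


def share_cross_team(reviews, team):
    """Share of reviews whose participants come from more than one team."""
    cross = sum(
        1 for participants in reviews.values()
        if len({team[person] for person in participants}) > 1
    )
    return cross / len(reviews)


print(round(share_disjoint_pairs(REVIEWS), 3))     # 0.667
print(round(share_cross_team(REVIEWS, TEAM), 3))   # 0.333
```

Enumerating all pairs is quadratic in the number of reviews. For a data set of the size reported here (220,733 reviews), a full pairwise comparison would likely need sampling or an index from participants to reviews rather than this naive loop.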

Keywords
code review, theory, communication network, information diffusion
National subject category
Research program
Software Engineering
Identifiers
urn:nbn:se:bth-28564 (URN)
Available from: 2025-09-01 Created: 2025-09-01 Last updated: 2025-09-30 Bibliographically approved
5. Taxing Collaborative Software Engineering: The Challenges for Tax Compliance in Software Engineering
2024 (English) In: IEEE Software, ISSN 0740-7459, E-ISSN 1937-4194, Vol. 41, no. 4, p. 143-150. Article in journal (Refereed), Published
Abstract [en]

The engineering of complex software systems is often the result of a highly collaborative effort. However, collaboration within a multinational enterprise has an overlooked legal implication when developers collaborate across national borders: It is taxable. In this article, we discuss the unsolved problem of taxing collaborative software engineering across borders.

Place, publisher, year, edition, pages
IEEE Computer Society, 2024
Keywords
Software, Software engineering, Finance, Collaboration, Codes, Pricing, Collaborative software
National subject category
Identifiers
urn:nbn:se:bth-26585 (URN); 10.1109/MS.2023.3346646 (DOI); 001241761200006 (); 2-s2.0-85164779874 (Scopus ID)
Funder
Knowledge Foundation, 20180010
Available from: 2024-06-26 Created: 2024-06-26 Last updated: 2025-09-30 Bibliographically approved
6. Quo Vadis, Code Review?: Exploring the Future of Code Review
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Code review has long been a core practice in collaborative software engineering. In this research, we explore how practitioners reflect on code review today and what changes they anticipate in the near future. We then discuss the potential long-term risks of these anticipated changes for the evolution of code review and its role in collaborative software engineering.

Keywords
code review, survey, artificial intelligence, collaborative software engineering
National subject category
Research program
Software Engineering
Identifiers
urn:nbn:se:bth-28568 (URN)
Available from: 2025-09-01 Created: 2025-09-01 Last updated: 2025-10-16 Bibliographically approved

Open Access in DiVA

fulltext (1265 kB), 158 downloads
File information
File name: FULLTEXT02.pdf; File size: 1265 kB; Checksum: SHA-512
21d24c63354b53f43fa18fc3feed894724bfe162c4133dc42a2decd0cb74933498da140b3828120d16a749a3a16be269308bd3d474163019d1556eae5356c08d
Type: fulltext; Mimetype: application/pdf

Search in DiVA

By author/editor
Dorner, Michael