Digitala Vetenskapliga Arkivet

The ontological politics of synthetic data: Normalities, outliers, and intersectional hallucinations
Linköping University, Department of Thematic Studies, Technology and Social Change. Linköping University, Faculty of Arts and Sciences. Division of Science, Technology, and Society, Chalmers Technical University, Göteborg, Sweden. ORCID iD: 0000-0002-7206-2046
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0002-0176-5852
Linköping University, Department of Thematic Studies, The Department of Gender Studies. Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Thematic Studies, Technology and Social Change. Department of Thematic Studies – Gender Studies, Linköping University, Linköping, Sweden. ORCID iD: 0000-0001-5041-5018
2025 (English). In: Big Data and Society, E-ISSN 2053-9517, Vol. 12, no. 2. Article in journal (Refereed). Published.
Abstract [en]

Synthetic data is increasingly used as a substitute for real data for ethical, legal, and logistical reasons. However, the rise of synthetic data also raises critical questions about its entanglement with the politics of classification and the reproduction of social norms and categories. This paper aims to problematize the use of synthetic data by examining how its production is intertwined with the maintenance of certain worldviews and classifications. We argue that synthetic data, like real data, is embedded with societal biases and power structures, leading to the reproduction of existing social inequalities. Through empirical examples, we demonstrate how synthetic data tends to highlight majority elements as the “normal” and minimize minority elements, and that the slight changes to the data structures that create synthetic data will also inevitably result in what we term “intersectional hallucinations.” These hallucinations are inherent to synthetic data and cannot be entirely eliminated without compromising the purpose of creating synthetic datasets. We contend that decisions about synthetic data involve determining which intersections are essential and which can be disregarded, a practice that imbues these decisions with norms and values. Our study underscores the need for critical engagement with the mathematical and statistical choices in synthetic data production and advocates for careful consideration of the ontological and political implications of these choices during curatorial-style production of synthetic structured data.
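
The idea of an intersection that appears in synthetic data but not in the real data can be illustrated with a small sketch. This is not the authors' method: it assumes a deliberately naive synthesizer that samples each column independently from its marginal distribution, and the toy records, attribute values, and function names are all hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical toy records (occupation, sex); not taken from the paper.
real = [
    ("nurse", "female"), ("nurse", "female"), ("nurse", "female"),
    ("engineer", "male"), ("engineer", "male"),
    ("engineer", "female"),
]

def marginal(rows, col):
    """Relative frequency of each value in one column."""
    counts = Counter(row[col] for row in rows)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}

def independent_model(rows):
    """Probability of each intersection when columns are sampled independently."""
    m0, m1 = marginal(rows, 0), marginal(rows, 1)
    return {(a, b): m0[a] * m1[b] for a, b in product(m0, m1)}

model = independent_model(real)
observed = set(real)

# Intersections the naive synthesizer can emit although no real record has them.
hallucinated = {combo: p for combo, p in model.items() if combo not in observed}

for combo, p in hallucinated.items():
    print(combo, round(p, 3))
```

In this toy case the only unobserved intersection is ("nurse", "male"). Restricting the synthesizer to observed intersections would remove it, but under this sketch's assumptions that also pushes the output toward copying the real records, which is one way to read the abstract's point that such intersections cannot be fully eliminated.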

Place, publisher, year, edition, pages
2025. Vol. 12, no 2
National Category
Information Systems, Social aspects; Other Computer and Information Science
Identifiers
URN: urn:nbn:se:liu:diva-212985
DOI: 10.1177/20539517251318289
OAI: oai:DiVA.org:liu-212985
DiVA, id: diva2:1951873
Funder
Wallenberg AI, Autonomous Systems and Software Program – Humanity and Society (WASP-HS)
Available from: 2025-04-14 Created: 2025-04-14 Last updated: 2025-04-14

Open Access in DiVA

fulltext (1281 kB), 57 downloads
File information
File name: FULLTEXT01.pdf
File size: 1281 kB
Checksum (SHA-512): 0174e1fd904e8e44238fa4c00189d4c19b42ab2a09d34a63534f298b9c30c8c49b92b6b8890af3ebbd3fde1e89ae7235b15054cacf987735aac390523d557d87
Type: fulltext
Mimetype: application/pdf

Authors
Lee, Francis; Hajisharif, Saghi; Johnson, Ericka

Total: 60 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 159 hits