Digitala Vetenskapliga Arkivet

Towards Automated, Context-Aware Management of Preservation Submissions
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Computer Science. ORCID iD: 0000-0001-5137-3390
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Computer Science.
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Computer Science. ORCID iD: 0000-0002-7477-0783
2018 (English). Conference paper, Oral presentation only (Refereed)
Abstract [en]

Research in the digital preservation field has recognized the need to automate digital preservation activities. Without automation, preserving digital entities is a complex and labor-intensive task. Scholars have introduced a middleware concept to support automated interactions between content management systems and digital preservation systems. To further automate workflows in the middleware, we introduce a component within it, the Context-aware Preservation Manager (CaPM), which administers the inner components of the middleware and the workflows for bi-directional interactions between content management systems and digital preservation systems. We describe the specifications of the Context-aware Preservation Manager and depict its inner components. Further, we explain the processes that are improved, supported, or able to run automatically as a result of its functionality.

Place, publisher, year, edition, pages
2018.
Keywords [en]
Long-term Digital Preservation, Automation, Middleware, Context-aware Preservation Manager
National Category
Information Systems
Research subject
Information systems
Identifiers
URN: urn:nbn:se:ltu:diva-76035
OAI: oai:DiVA.org:ltu-76035
DiVA, id: diva2:1351906
Conference
41st Information Systems Research Seminar in Scandinavia (IRIS 41), Odder, Denmark, August 5-8, 2018
Funder
EU, FP7, Seventh Framework Programme, 600826
Available from: 2019-09-17 Created: 2019-09-17 Last updated: 2021-09-28 Bibliographically approved
In thesis
1. Designing for Automated Digital Preservation: Model, Pre-Ingest, and Error Handling
2020 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

With the rapid increase in the amount and complexity of data that needs to be preserved, manual preservation activities result in complex, lengthy, and costly processes. Therefore, automating preservation processes, together with modeling and streamlining workflows, can help reduce costs and sharpen the focus on preservation processes. Accordingly, the research question is defined as: “How to establish an automated many-to-many interaction between Information Systems and digital preservation systems?”

This research proposes a model and an instantiation of middleware as a standalone system, which could be hosted in the cloud, that bridges Information Systems (ISs) and digital preservation systems (DPSs). It includes three sub-parts that make both many-to-many capacity and automated interactions possible: a pre-ingest workflow, a Context-aware Preservation Manager (CaPM), and an error-handling workflow. The research followed a Design Science Research (DSR) approach consisting of three design cycles to design and develop each of the three sub-parts of the solution artifact, i.e. the middleware. The middleware consists of several action-based components and an administrative component (CaPM), which carries out the automation of tasks in the middleware. The action-based components are designed to complete a pre-ingest workflow that prepares digital content sent from an information system for transfer into a digital preservation system. The path for the pre-ingest workflow, i.e. which components process the digital content and in what order, is automatically defined by CaPM according to the preservation policies of the information system. Standard interfaces are used for the internal and external communications of the middleware to promote its long-term scalability as well as its capability to embed additional workflows or processes developed in the future, e.g. a post-access workflow.
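
The idea of CaPM deriving the pre-ingest path from an information system's preservation policy can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the abstract does not specify component names, policy fields, or rule syntax, so `RULES`, `plan_path`, and the policy keys are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical names throughout: the abstract does not name the
# action-based components or describe the policy format.

@dataclass(frozen=True)
class Rule:
    component: str                   # name of an action-based component
    applies: Callable[[dict], bool]  # predicate over the IS preservation policy

# Ordered rule table: the order of the rules is the order of the path.
RULES = [
    Rule("virus_scan", lambda p: True),
    Rule("format_identification", lambda p: True),
    Rule("format_migration", lambda p: p.get("migrate_to") is not None),
    Rule("metadata_extraction", lambda p: bool(p.get("metadata_fields"))),
    Rule("sip_packaging", lambda p: True),
]

def plan_path(policy: dict) -> list[str]:
    """Return the ordered component names whose rule matches the policy."""
    return [rule.component for rule in RULES if rule.applies(policy)]
```

In this sketch an empty policy yields only the unconditional components, while a policy that sets `migrate_to` or `metadata_fields` adds the corresponding optional steps in their fixed position.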

An additional outcome of this research is a set of five design principles that aim to contribute to the knowledge for future design practices:

  • DP1. Provide rule-based definition of the workflow execution path so that the middleware affords ISs to implement their preservation policy and metadata extraction requirements.
  • DP2. Provide the capability of executing alternative workflow routes so that the middleware affords ISs to ensure a successful encapsulation and submission of the SIP.
  • DP3. Provide features for gathering preservation data in the middleware so that the middleware affords preservation planning support.
  • DP4. Provide an automated error-handling workflow with compensating action so that the middleware affords to minimize manual intervention in case of errors in a workflow.
  • DP5. Provide the capability of executing concurrent workflows so that the middleware affords IS and DPS many-to-many interactions via the middleware.
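
As a rough illustration of DP5, concurrent workflows for several IS and DPS pairs could be run with a thread pool. This is only a sketch under stated assumptions: `submit` is a hypothetical stand-in for a complete pre-ingest workflow, and the abstract does not say how the middleware actually schedules concurrent work.

```python
from concurrent.futures import ThreadPoolExecutor

def submit(source_is: str, target_dps: str) -> str:
    # Hypothetical placeholder for one complete pre-ingest workflow run.
    return f"{source_is}->{target_dps}"

def run_concurrently(pairs: list[tuple[str, str]]) -> list[str]:
    """Run one workflow per (IS, DPS) pair; results keep the input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda pair: submit(*pair), pairs))
```

Each pair is handled independently, so one information system can submit to several preservation systems while another submits to the same ones.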

The results of this thesis contribute to the state-of-the-art in a few aspects:

  • Compared to existing solutions that need to be installed on a user’s system, such as the pre-ingest tool developed for the Finnish National Archives and UAM for Estonia, integration with the middleware is less complex. This is achieved by designing the middleware as a standalone system that could be hosted in the cloud and by using standard communication interfaces, which further make the middleware adaptable to changes or upgrades in its operating environment. The capability of the middleware to handle many-to-many interactions goes beyond what was introduced in previous middleware architectures for integrating Digital Preservation Systems with Information Systems.
  • The middleware solution for pre-ingest in this thesis, in comparison with similar recent solutions, promotes automation capabilities, especially for preserving complex digital content (e.g. databases, workflows), for automatic execution of the pre-ingest workflow, or when multiple external digital preservation solutions or services need to be used.
  • CaPM monitors the execution of workflows and can update or abort a workflow path if needed. A workflow aborted because of an error or failure is automatically replaced by an error-handling workflow with a compensating action, which increases the level of automation. Automation of such functionality, as well as this approach to handling errors, has not been applied in previous tools.
  • CaPM can also contribute to the current stream of research on decision-making regarding preservation planning and strategies by providing logged data about the digital objects passing through the middleware.
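
The monitoring and error-handling behaviour described for CaPM can be sketched as a workflow runner that, on failure, aborts and undoes the completed steps in reverse order while logging what happened. This is an illustrative pattern only, not the thesis implementation: the step tuples, the `log` key, and `run_workflow` are hypothetical names.

```python
from typing import Callable

# A step is (name, action, compensating action); all names are hypothetical.
Step = tuple[str, Callable[[dict], None], Callable[[dict], None]]

def run_workflow(steps: list[Step], context: dict) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed = []
    log = context.setdefault("log", [])  # logged data about the run
    for name, action, compensate in steps:
        try:
            action(context)
        except Exception as exc:
            log.append(f"failed:{name}:{exc}")
            # Error-handling workflow: undo what has already been done.
            for done_name, done_compensate in reversed(completed):
                done_compensate(context)
                log.append(f"compensated:{done_name}")
            return False
        completed.append((name, compensate))
        log.append(f"done:{name}")
    return True
```

The log kept in `context` is a simple stand-in for the logged data about digital objects that the last bullet mentions as input to preservation planning.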

While the solution artifact of this research provides middleware that acts as a bridge for automated many-to-many interactions between information systems and digital preservation systems, the resulting design and implementation of the middleware components cover only one direction of such interaction: from information system to digital preservation system (pre-ingest).

Place, publisher, year, edition, pages
Luleå: Luleå University of Technology, 2020. p. 149
Series
Doctoral thesis / Luleå University of Technology 1 jan 1997 → …, ISSN 1402-1544
Keywords
Long-term Digital Preservation, Design Science Research, Information System
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Information Systems, Social aspects
Research subject
Information systems
Identifiers
urn:nbn:se:ltu:diva-78243 (URN)
978-91-7790-565-3 (ISBN)
978-91-7790-566-0 (ISBN)
Public defence
2020-06-03, A109, Luleå University of Technology, A building, 13:00 (English)
Opponent
Supervisors
Projects
ForgetIT - European Commission FP7
Funder
EU, FP7, Seventh Framework Programme
Available from: 2020-03-31 Created: 2020-03-30 Last updated: 2020-05-12 Bibliographically approved

Open Access in DiVA

fulltext (264 kB), 353 downloads
File information
File name: FULLTEXT01.pdf
File size: 264 kB
Checksum SHA-512:
c9ce587cb623c29c70b67dbed2dc5b1014deed71ab28b6a54069782be56688a07723fb7302460f0a50ab3f0150da45b9b533091411d5969d7c611be03fcacc4c
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Westerlund, Parvaneh; Andersson, Ingemar; Päivärinta, Tero