Preparation and analysis of multiple source industrial process data
Number of Authors: 4
2005 (English) Report (Refereed)
Industrial process data is often stored in a wide variety of formats and in several different repositories. Effective methodologies and tools for data preparation and merging are therefore critical for efficient analysis of such data. Experience shows that data analysis projects involving industrial data often spend the major part of their effort on these tasks, leaving little room for model development and generating applications. This paper identifies and classifies the needs and individual steps in the preparation of industrial data. A data preparation methodology specifically suited to the domain is proposed, and a practically useful set of primitive operations to support the methodology is defined. Finally, a proof-of-concept data preparation system is presented that implements the proposed operations and a scripting facility to support the iterations in the methodology, along with a discussion of the necessary and desirable properties of such a tool.
Place, publisher, year, edition, pages
Swedish Institute of Computer Science, 2005, 1., 25 p.
SICS Technical Report, ISSN 1100-3154 ; 2005:10
Keywords
Data Preparation Methodology, Multiple Source Data Merging, Data Analysis, Data Mining, Data Cleaning, Data Preprocessing
Computer and Information Science
Identifiers
URN: urn:nbn:se:ri:diva-14277
OAI: oai:DiVA.org:ri-14277
DiVA: diva2:1035565