Automation of a Data Analysis Pipeline for High-content Screening Data
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, The Institute of Technology.
2015 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

High-content screening is a part of the drug discovery pipeline that deals with identifying substances that affect cells in a desired manner. Biological assays covering a large set of compounds are developed and screened, and the resulting output has a multidimensional structure. The data analysis is performed manually by an expert using a set of tools, which is considered too time-consuming and becomes unmanageable as the amount of data grows. This thesis therefore investigates and proposes a way of automating the data analysis phase with a set of machine learning algorithms. The resulting implementation is a cloud-based application that supports the user in selecting which features are relevant for further analysis. It also provides techniques for automated processing of the dataset and for training classification models that can be used to predict sample labels. Before this thesis, an investigation of the data analysis workflow was conducted. It resulted in a pipeline that maps the different tools and software to the goals they fulfil and the purposes they serve for the user. This pipeline was then compared with a similar pipeline that includes the implemented application. The comparison shows clear advantages over the previous methodology: the application supports a more automated way of performing data analysis.

Place, publisher, year, edition, pages
2015. 77 p.
Keyword [en]
Machine learning, Computer engineering (Datateknik)
National Category
Media and Communication Technology
URN: urn:nbn:se:liu:diva-122913
ISRN: LIU-ITN-TEK-A--15/053--SE
OAI: diva2:874880
Subject / course
Computer Engineering
Available from: 2015-11-30. Created: 2015-11-30. Last updated: 2015-11-30. Bibliographically approved.

Open Access in DiVA

fulltext (4488 kB), 87 downloads
File information
File name: FULLTEXT01.pdf
File size: 4488 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Bergström, Simon; Ivarsson, Oscar