Software engineering for scientific big data analysis
2019 (English) In: GigaScience, E-ISSN 2047-217X, Vol. 8, no 5, article id giz054
Article, review/survey (Refereed) Published
Abstract [en]
The increasing complexity of data and analysis methods has created an environment where scientists, who may not have formal training, are finding themselves playing the impromptu role of software engineer. While several resources are available for introducing scientists to the basics of programming, researchers have been left with little guidance on approaches needed to advance to the next level for the development of robust, large-scale data analysis tools that are amenable to integration into workflow management systems, tools, and frameworks. The integration into such workflow systems necessitates additional requirements on computational tools, such as adherence to standard conventions for robustness, data input, output, logging, and flow control. Here we provide a set of 10 guidelines to steer the creation of command-line computational tools that are usable, reliable, extensible, and in line with standards of modern coding practices.
Place, publisher, year, edition, pages
Oxford University Press, 2019. Vol. 8, no 5, article id giz054
Keywords [en]
software development, big data, workflow, standards, data analysis, coding, software engineering, scientific software, integration systems, computational tools
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:uu:diva-390339
DOI: 10.1093/gigascience/giz054
ISI: 000474856100022
PubMedID: 31121028
OAI: oai:DiVA.org:uu-390339
DiVA, id: diva2:1341613
Funder
EU, Horizon 2020, 654241
2019-08-09, 2019-08-09, 2023-02-06. Bibliographically approved