Semantic Web Queries over Scientific Data
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science.
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computing Science. (UDBL)
ORCID iD: 0000-0002-7965-9128
2016 (English). Doctoral thesis, monograph (Other academic)
Abstract [en]

The Semantic Web and Linked Open Data provide a potential platform for interoperability of scientific data, offering a flexible model for providing machine-readable and queryable metadata. However, RDF and SPARQL have gained only limited adoption within the scientific community, mainly because they lack support for managing massive numeric data and for other important features such as extensibility with user-defined functions, query modularity, and integration with existing environments and workflows.

We present the design, implementation and evaluation of Scientific SPARQL, a language for querying data and metadata combined, where both are represented using RDF with Arrays: the RDF graph model extended with numeric multidimensional arrays as node values. The techniques used to store RDF with Arrays in a scalable way and to process Scientific SPARQL queries and updates are implemented in our prototype software, the Scientific SPARQL Database Manager (SSDM), and in its integrations with data storage systems and computational frameworks. This includes scalable storage solutions for numeric multidimensional arrays and an efficient implementation of array operations. The arrays can be physically stored in a variety of external storage systems, including files, relational databases, and specialized array data stores, using our Array Storage Extensibility Interface. Whenever possible, SSDM accumulates array operations and accesses array contents lazily.
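
The idea of accumulating array operations and reading contents lazily can be illustrated with a minimal Python sketch. It is not SSDM code and does not use the Array Storage Extensibility Interface; the `LazyArray` class, the `fetch` callback standing in for a storage back-end, and the one-dimensional layout are all assumptions made for illustration.

```python
class LazyArray:
    """Hypothetical proxy: records slicing operations, reads storage only on demand."""

    def __init__(self, fetch, length, start=0, step=1):
        self._fetch = fetch    # callable(index) -> value; stands in for a storage back-end
        self._length = length  # number of elements visible through this view
        self._start = start    # offset of the first visible element in storage
        self._step = step      # stride between visible elements in storage

    def __len__(self):
        return self._length

    def __getitem__(self, key):
        if isinstance(key, slice):
            start, stop, step = key.indices(self._length)
            if step <= 0:
                raise ValueError("only positive steps are supported in this sketch")
            new_length = max(0, (stop - start + step - 1) // step)
            # Compose the new slice with the accumulated one; no data is read here.
            return LazyArray(self._fetch, new_length,
                             self._start + start * self._step,
                             self._step * step)
        if key < 0:
            key += self._length
        if not 0 <= key < self._length:
            raise IndexError(key)
        return self._fetch(self._start + key * self._step)

    def materialize(self):
        """Read exactly the elements selected by the accumulated operations."""
        return [self._fetch(self._start + i * self._step)
                for i in range(self._length)]


reads = []  # log of storage accesses, to make the laziness visible


def fetch(index):
    reads.append(index)
    return index * 10


arr = LazyArray(fetch, 1000)
view = arr[100:200][::25]      # two slicing operations are composed
reads_before = len(reads)      # still 0: nothing has been fetched yet
result = view.materialize()    # only the four selected elements are read
```

In this sketch the two slices collapse into a single offset and stride, so only four of the thousand elements are ever requested from the back-end.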

In scientific applications, numeric computations are often used for filtering or post-processing the retrieved data, which can be expressed in a functional way. Scientific SPARQL allows common query sub-tasks to be expressed as functions defined as parameterized queries. This becomes especially useful in combination with functional language abstractions such as lexical closures and second-order functions, e.g. array mappers.
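
For readers unfamiliar with these abstractions, the following Python sketch shows a lexical closure passed to a second-order array mapper. It is written in Python rather than Scientific SPARQL, and the names `array_map` and `make_scaler` are hypothetical, chosen only for this illustration.

```python
def array_map(fn, array):
    """Second-order function: applies fn to every element of a nested list."""
    if isinstance(array, list):
        return [array_map(fn, item) for item in array]
    return fn(array)


def make_scaler(factor, offset):
    """Returns a closure that captures factor and offset from this scope."""
    def scale(x):
        return factor * x + offset
    return scale


double_plus_one = make_scaler(2, 1)   # closure over factor=2, offset=1
matrix = [[1, 2], [3, 4]]             # a small two-dimensional array
mapped = array_map(double_plus_one, matrix)
```

The mapper knows nothing about the scaling parameters; they travel with the closure, which is what makes such functions convenient to reuse across queries.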

Existing computational libraries can be interfaced and invoked from Scientific SPARQL queries as foreign functions. Cost estimates and alternative evaluation directions may be specified, aiding the construction of better execution plans. Costly array processing, e.g. filtering and aggregation, is thus performed on the server, reducing the amount of communication. Furthermore, commonly supported operations are delegated to the array storage back-ends, according to their capabilities. Both the expressivity and the performance of Scientific SPARQL are evaluated on a real-world example, and further performance tests are run using our mini-benchmark for array queries.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2016. 214 p.
Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1104-2516 ; 121
Keyword [en]
RDF, SPARQL, Arrays, Query optimization, Second-order functions, Scientific workflows
National Category
Computer Science
Research subject
Computer Science with specialization in Database Technology
URN: urn:nbn:se:uu:diva-274856
ISBN: 978-91-554-9465-0
OAI: diva2:897986
Public defence
2016-03-23, Lecture hall 2446, Polacksbacken, Uppsala, 14:00 (English)
Available from: 2016-02-25. Created: 2016-01-26. Last updated: 2016-03-09. Bibliographically approved.

Open Access in DiVA

fulltext (1693 kB), 232 downloads
File information
File name: FULLTEXT01.pdf. File size: 1693 kB. Checksum: SHA-512
Type: fulltext. Mimetype: application/pdf
By author/editor
Andrejev, Andrej
Total: 232 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 1373 hits