Snippet Generation for Provenance Workflows
Independent thesis Advanced level (degree of Master (Two Years)), 30 credits / 45 HE credits
Student thesis
Scientists often need to know how data was derived in addition to what it is. Detailed tracking of data transformations, or provenance, enables result reproducibility, knowledge reuse and data analysis. Scientific workflows are increasingly used to represent provenance because they can record complicated processes at various levels of detail. In the context of knowledge reuse and sharing, search technology is of paramount importance, especially considering the huge and ever-increasing amount of scientific data. It is computationally hard to produce a single exact answer to a user's query because of the sheer volume and complicated structure of provenance. One solution to this problem is to produce a list of candidate matches and let the user select the most relevant result. Here, search result presentation becomes very important, as the user must make the final decision by looking at the workflows in the result list. The presentation of these candidate matches needs to be brief, precise, clear and revealing. This is a challenging task for workflows, as they contain textual content as well as graphical structure. Current workflow search engines such as Yahoo! Pipes or myExperiment ignore the actual workflow specification and use metadata to create summaries. Workflows that lack metadata do not yield good summaries even if they are useful and relevant to the search criteria. This work investigates the possibility of creating meaningful and usable summaries, or snippets, based on the structure and specification of workflows.
We shall (1) present relevant published work on snippet-building techniques; (2) explain how we mapped current techniques to our work; (3) describe how we identified techniques from interface design theory in order to build a usable graphical interface; (4) present the implementation of two new algorithms for workflow graph compression, together with their complexity analysis; and (5) identify future work on our implementation and outline open research problems in the field of snippet building.
Place, publisher, year, edition, pages
2011. 67 p.
Keywords: provenance, snippets, graph compression
Identifiers
URN: urn:nbn:se:liu:diva-71028
ISRN: LIU-IDA/LITH-EX-A--11/032--SE
OAI: oai:DiVA.org:liu-71028
DiVA: diva2:444099
Subject / course
2011-08-30, Charles Babbage, Linköping Universitet, Linköping, 19:24 (English)