Change search
ReferencesLink to record
Permanent link

Direct link
Comparison and Implementation of Query Containment Algorithms for XPath
KTH, School of Computer Science and Communication (CSC).
KTH, School of Computer Science and Communication (CSC).
2016 (English)Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesisAlternative title
Jämförelse och implementation av Query Containment-algoritmer för XPath (Swedish)
Abstract [en]

This thesis investigates the practical aspects of implementing Query Containment algorithms for the query language XPath. Query Containment is the problem to decide if the results of one query are a subset of the results of another query for any database. Query Containment algorithms can be used for the purpose of optimising the querying process in database systems. Two algorithms have been implemented and compared, The Canonical Model and The Homomorphism Technique. The algorithms have been compared with respect to speed, ease of implementation, accuracy and usability in database systems. Benchmark tests were developed to measure the execution times of the algorithms on a specific set of queries. A simple database system was developed to investigate the performance gain of using the algorithms. It was concluded that The Homomorphism Technique outperforms The Canonical Model in every test case with respect to speed. The Canonical Model is however more accurate than The Homomorphism Technique. Both algorithms were easy to implement, but The Homomorphism Technique was easier. In the database system, there was performance to be gained by using Query Containment algorithms for a certain type of queries, but in most cases there was a performance loss. A database system that utilises Query Containment algorithms for optimisation would for every issued query have to evaluate if such an algorithm should be used.

Abstract [sv]

Denna rapport undersöker de praktiska aspekterna av att implementera Query Containment-algoritmer för queryspråket XPath. Query Containment är problemet att avgöra om resultaten av en query är en delmängd av resultaten av en annan query, oavsett databas. Query Containment-algoritmer kan användas för ändamålet att optimera queryingprocessen i databassystem. Två algoritmer har implementerats och jämförts, The Canonical Model och The Homomorphism Technique. Algoritmerna har jämförts med avseende på hastighet, lätthet att implementera, exakthet och användbarhet i riktiga databassystem. Prestandatester utvecklades för att mäta exekveringstider för algoritmerna på specifikt framtagna queries. Ett enkelt databassystem utvecklades för att undersöka prestandavinsten av att använda algoritmerna. Slutsatsen att The Homomorphism Technique presterar bättre än The Canonical Model i samtliga testfall med avseende på hastighet drogs. The Canonical Model är dock mer exakt än The Homomorphism Technique. Båda algoritmerna var lätta att implementera, men The Homomorphism Technique var lättare. I databassystemet fanns det en prestandavinst i att använda Query Containment-algoritmer för en viss typ av queries, men i de flesta fall var det en prestandaförlust. Ett databassystem som använder Query Containment-algoritmer för optimering bör för varje query avgöra om en sådan algoritm ska användas.

Place, publisher, year, edition, pages
2016. , 53 p.
Keyword [en]
Query Containment, XPath, Implementation, The Canonical Model, The Homomorphism Technique
National Category
Computer Science
URN: urn:nbn:se:kth:diva-186467OAI: diva2:927368
Educational program
Master of Science in Engineering - Computer Science and Technology
2016-05-04, E31, Lindstedtsvägen 3, Stockholm, 16:35 (Swedish)
Available from: 2016-05-18 Created: 2016-05-11 Last updated: 2016-05-18Bibliographically approved

Open Access in DiVA

fulltext(933 kB)16 downloads
File information
File name FULLTEXT01.pdfFile size 933 kBChecksum SHA-512
Type fulltextMimetype application/pdf

Search in DiVA

By author/editor
Wåreus, LinusWällstedt, Max
By organisation
School of Computer Science and Communication (CSC)
Computer Science

Search outside of DiVA

GoogleGoogle Scholar
Total: 16 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 57 hits
ReferencesLink to record
Permanent link

Direct link