Evaluating Presto as an SQL on Hadoop solution: A Case at Truecaller
2016 (English)
Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Abstract [en]

Truecaller is a mobile application with over 200 million unique users worldwide. Every day, Truecaller stores over 1 billion rows of data, which it analyses to improve its product. The data is stored in Hadoop, a framework for storing and analysing large amounts of data on a distributed file system. To analyse these large amounts of data, the analytics team needs a new solution for more lightweight, ad-hoc analysis. This thesis evaluates the performance of the query engine Presto to see whether it meets the requirements for helping the data analytics team at Truecaller gain efficiency. Presto’s pros and cons are presented, following a design-science methodology. Presto is recommended as a solution to be used alongside the existing tools for specific lightweight use cases, by users who are familiar with the data sets used by the analytics team. Other solutions are also recommended for future evaluation before a final decision is taken.

Keywords: Hadoop, Big Data, Presto, Hive, SQL on Hadoop

Place, publisher, year, edition, pages
2016. 54 p.
Keyword [en]
Social Behaviour Law
Keyword [sv]
Samhälls-, beteendevetenskap, juridik, Hadoop, Big Data, Presto, SQL on Hadoop
URN: urn:nbn:se:ltu:diva-47369
Local ID: 4ed7b7af-a682-42de-8ab0-896781b4ba4c
OAI: diva2:1020690
External cooperation
Subject / course
Student thesis, at least 15 credits
Educational program
Systems Sciences, bachelor's level

Validated; 20160819 (global_studentproject_submitter)

Available from: 2016-10-04
Created: 2016-10-04
Last updated: 2016-10-14
Bibliographically approved

Open Access in DiVA

fulltext (9574 kB), 0 downloads
File information
File name: FULLTEXT02.pdf
File size: 9574 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf

By author/editor
Ahmed, Sahir
