Analyzing the impact of data compression in Hive
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
2014 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

Executing expensive queries over many large tables can be prohibitively time-consuming in conventional relational databases. Hadoop and its data warehouse, Hive, are a powerful alternative for large-scale data processing. Conventionally, data is stored in Hive without compression. Storing the data compressed is valuable if the overhead of compression does not negatively impact query processing time. Through experiments with imports, transformations, and exports of Hive data in various file formats and with different compression techniques, this paper describes how this can be achieved.

Place, publisher, year, edition, pages
2014. 36 p.
IT, 14074
National Category
Engineering and Technology
URN: urn:nbn:se:uu:diva-269235
OAI: diva2:882559
Educational program
Bachelor Programme in Computer Science
Available from: 2015-12-15 Created: 2015-12-15 Last updated: 2016-02-11 (Bibliographically approved)

Open Access in DiVA

fulltext (733 kB), 108 downloads
File information
File name FULLTEXT01.pdf
File size 733 kB
Checksum SHA-512
Type fulltext
Mimetype application/pdf
