Data Analysis on Hadoop - finding tools and applications for Big Data challenges
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
2015 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

With the increasing amount of data generated each day, recent developments in software provide the tools needed to tackle the challenges of the so-called Big Data era. This project introduces some of these platforms, focusing in particular on platforms for data analysis and query tools that work alongside Hadoop. In the first part of this project, the Hadoop framework and its main components, MapReduce, YARN and HDFS, are introduced. This is followed by an overview of seven platforms that are part of the Hadoop ecosystem, covering their key features, components, programming model and architecture. The following chapter introduces 12 parameters that are used to compare these platforms side by side, and it ends with a summary and discussion in which the platforms are divided into several classes according to their usage, use cases and data environment. In the last part of this project, a web log analysis of data belonging to one of Sweden's top newspapers is carried out using Apache Spark, one of the platforms analyzed. The purpose of this analysis is to showcase some of the features of Spark while doing an exploratory data analysis.

Place, publisher, year, edition, pages
2015. 79 p.
Series
IT, 15053
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:uu:diva-260557
OAI: oai:DiVA.org:uu-260557
DiVA: diva2:847616
Educational program
Master Programme in Computer Science
Supervisors
Examiners
Available from: 2015-08-20. Created: 2015-08-20. Last updated: 2015-08-20. Bibliographically approved.

Open Access in DiVA

fulltext (1347 kB), 833 downloads
File information
File name: FULLTEXT01.pdf
File size: 1347 kB
Checksum (SHA-512):
d1e40269dc1ca1e595cac9b0dfdcfc51a988c472b01857ad54664ffed64de106069c5873d4c4b1419366d8141e6c860ba2d7085e6d25fe9010a7e56e4540ddd8
Type: fulltext
Mimetype: application/pdf

Total: 833 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 1835 hits