KTHFS Orchestration: PaaS orchestration for Hadoop
KTH, School of Information and Communication Technology (ICT).
2013 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

Platform as a Service (PaaS) has had a major impact on how we can offer easy-to-use, scalable software that adapts to users' needs. It has made it possible for systems to configure themselves easily in response to customer demand. Based on these features, considerable interest has emerged in offering virtualized Hadoop solutions built on Infrastructure as a Service (IaaS) architectures, in order to easily deploy fully functional Hadoop clusters on platforms like Amazon EC2 or OpenStack.

In this thesis, we study the possibility of enhancing the capabilities of KTHFS, a modified Hadoop platform under development, to allow automatic configuration of a complete, functional cluster on IaaS platforms. To achieve this, we study proposals for similar PaaS platforms from companies like VMware or Amazon, and analyze existing node orchestration techniques for configuring nodes on cloud providers like Amazon EC2 or OpenStack, with the aim of later automating this process.

This is the starting point for the work, which leads to the development of our own orchestration language for KTHFS and two artifacts: (i) a simple Web Portal to launch the KTHFS Dashboard on the supported IaaS platforms, and (ii) a component integrated into the Dashboard that is in charge of analyzing a cluster definition file and initiating the configuration and deployment of a cluster using Chef.

Lastly, we discover new issues related to scalability and performance when integrating the new components into the Dashboard. This leads us to analyze solutions for optimizing the performance of our deployment architecture, and we reduce the deployment time by introducing a few modifications to the architecture.

Finally, we conclude with a few words about ongoing and future work.

Place, publisher, year, edition, pages
2013. 95 p.
Trita-ICT-EX, 2013:175
Keyword [en]
KTHFS, HDFS Orchestration, Chef, Jclouds, Amazon EC2, Openstack, PaaS
National Category
Engineering and Technology
URN: urn:nbn:se:kth:diva-128935
OAI: diva2:648868
Educational program
Master of Science - Software Engineering of Distributed Systems
Available from: 2013-09-17. Created: 2013-09-17. Last updated: 2013-09-17. Bibliographically approved.
