KTH, School of Information and Communication Technology (ICT).
2013 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

KTHFS is a highly available and scalable file system built on version 0.24 of the Hadoop Distributed File System (HDFS). It provides a platform for overcoming the limitations of existing distributed file systems: the scalability of the metadata server in terms of memory usage and throughput, and its availability.

This document describes the KTHFS architecture and how it addresses these problems with a well-coordinated, distributed, stateless metadata server (in our case, Namenode) architecture, backed by a persistence layer such as an NDB cluster. The primary focus is high availability of the Namenode.

KTHFS achieves scalability and recovery by persisting the metadata to an NDB cluster. All namenodes are connected to this cluster and are therefore aware of the state of the file system at any point in time.
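
The idea that stateless namenodes share one view of the file system through a common store can be illustrated with a minimal sketch. It is a toy under stated assumptions: `SharedMetadataStore` is an in-memory stand-in for the NDB cluster, and `StatelessNamenode`, `mkdir`, and `exists` are hypothetical names chosen for illustration, not the KTHFS API.

```python
import threading


class SharedMetadataStore:
    """In-memory stand-in for the persistence layer (hypothetical; not NDB)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._paths = {"/"}

    def add_path(self, path):
        with self._lock:
            self._paths.add(path)

    def has_path(self, path):
        with self._lock:
            return path in self._paths


class StatelessNamenode:
    """Keeps no metadata of its own; every operation goes to the shared store."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def mkdir(self, path):
        self._store.add_path(path)

    def exists(self, path):
        return self._store.has_path(path)


store = SharedMetadataStore()
nn1 = StatelessNamenode("nn1", store)
nn2 = StatelessNamenode("nn2", store)

nn1.mkdir("/data")
# nn2 never handled the mkdir, yet it sees the directory via the store.
visible_on_nn2 = nn2.exists("/data")

# A namenode started later (for example after a crash) sees the same state.
nn3 = StatelessNamenode("nn3", store)
visible_on_nn3 = nn3.exists("/data")
```

Because the namenode objects hold only a reference to the store, replacing one of them loses no metadata, which is the property the abstract relies on for recovery.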

For high availability, KTHFS provides a multi-Namenode architecture. Because these namenodes are stateless and share a consistent view of the metadata, clients can issue requests to any of them. If one of these servers goes down, a client can retry its operation on the next available namenode.
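
The client-side retry described above might look roughly like the following sketch. It assumes a simple in-order failover policy, which the abstract does not specify; `Namenode`, `NamenodeUnavailable`, `get_file_info`, and `with_failover` are hypothetical names, and a shared dictionary stands in for the persisted metadata.

```python
class NamenodeUnavailable(Exception):
    """Raised when a namenode cannot serve a request (hypothetical)."""


class Namenode:
    def __init__(self, name, metadata, alive=True):
        self.name = name
        self.metadata = metadata  # shared dict stands in for the persistence layer
        self.alive = alive

    def get_file_info(self, path):
        if not self.alive:
            raise NamenodeUnavailable(self.name)
        return self.metadata[path]


def with_failover(namenodes, operation):
    """Try the operation on each namenode in order until one succeeds."""
    failed = []
    for nn in namenodes:
        try:
            return nn.name, operation(nn)
        except NamenodeUnavailable:
            failed.append(nn.name)
    raise NamenodeUnavailable("all namenodes failed: " + ", ".join(failed))


metadata = {"/data/a.txt": {"size": 42}}
namenodes = [
    Namenode("nn1", metadata, alive=False),  # simulate a crashed server
    Namenode("nn2", metadata),
    Namenode("nn3", metadata),
]

# The request to nn1 fails, so the client retries on nn2.
served_by, info = with_failover(
    namenodes, lambda nn: nn.get_file_info("/data/a.txt")
)
```

Retrying on a different server is safe here only because every namenode answers from the same metadata; with per-server state, the second namenode could return a stale or different result.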

We then evaluate KTHFS in terms of its metadata capacity for medium and large clusters and the throughput and high availability of the Namenode, and we analyze the underlying NDB cluster.

Finally, we conclude with a brief discussion of ongoing and future work on KTHFS.

Place, publisher, year, edition, pages
2013. 73 p.
Trita-ICT-EX, 2013:30
Keyword [en]
Namenode, NDB cluster, MySQL cluster, KTHFS, HDFS, metadata, High Availability, Scalability, throughput
National Category
Engineering and Technology
URN: urn:nbn:se:kth:diva-117918
OAI: diva2:603878
Educational program
Master of Science - Software Engineering of Distributed Systems
Available from: 2013-03-21 Created: 2013-02-07 Last updated: 2013-03-21 Bibliographically approved

Open Access in DiVA

fulltext (1915 kB), 249 downloads
File information
File name: FULLTEXT01.pdf
File size: 1915 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf
