LenticularFS: scalable hierarchical filesystem for the cloud
KTH, School of Information and Communication Technology (ICT).
2017 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

The Hadoop platform is the most common solution for handling the explosion of big data that both companies and research institutions are facing. To store such data, the Hadoop platform provides HDFS, a scalable distributed filesystem that runs on commodity hardware and scales linearly as new storage nodes are added. While the storage capacity of the system can be increased by adding new storage nodes, the component that handles metadata for the filesystem, the namenode, is a single point of failure and cannot easily be replaced or linearly scaled. The Hops project provides an alternative implementation of the namenode, which increases performance and scalability by storing metadata in an external distributed NewSQL database called MySQL Cluster. With the new architecture, the system is much more scalable and can transparently manage the failover of namenodes, which are now stateless components. HopsFS is, however, still limited to running within a single datacenter, which can cause severe outages if the entire datacenter becomes unavailable. Cloud-native storage systems, such as Amazon’s Simple Storage Service (S3), solve this problem by replicating data across different, geographically distant datacenters, so that the failure of any given zone does not cause data unavailability. The objective of this thesis is to enable HopsFS to work across geographical regions while, as far as possible, maintaining the semantics of a POSIX-style hierarchical filesystem. We leverage the asynchronous replication functionality provided by MySQL Cluster to replicate metadata across geographical regions, and we present a detailed analysis of how to maintain the consistency properties of HDFS in such an environment. Furthermore, we analyze the issue of split-brain scenarios and propose a way for namenodes to detect this condition and continue operating correctly. Finally, we discuss the changes to the codebase that are required to implement the proposed plan.

Place, publisher, year, edition, pages
2017. p. 66
Series
TRITA-ICT-EX ; 2017:147
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-219604
OAI: oai:DiVA.org:kth-219604
DiVA, id: diva2:1164146
Subject / course
Computer Science
Educational program
Master of Science - Computer Science
Available from: 2017-12-14 Created: 2017-12-10 Last updated: 2018-01-13 Bibliographically approved

Open Access in DiVA

fulltext (1991 kB), 54 downloads
File information
File name: FULLTEXT01.pdf
File size: 1991 kB
Checksum (SHA-512): 6bd5ad330672ccdce6ff73b69da324cb22039cfdb5d57a618c6854a65b0063b172db6fc5ceff5c95d6866383e2fc927d1ff7faae18553386368f53af721c64df
Type: fulltext
Mimetype: application/pdf
