Change search
ReferencesLink to record
Permanent link

Direct link
Persistence and Node FailureRecovery in Strongly Consistent Key-Value Datastore
KTH, School of Information and Communication Technology (ICT).
2012 (English)Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
Abstract [en]

Consistency preservation of replicated data is a critical aspect for distributed databaseswhich are strongly consistent. Further, in fail-recovery model each process also needs todeal with the management of stable storage and amnesia [1]. CATS is a key/value datastore which combines the Distributed Hash Table (DHT) like scalability and selforganization and also provides atomic consistency of the replicated items. However beingan in memory data store with consistency and partition tolerance (CP), it suffers frompermanent unavailability in the event of majority failure.

The goals of this thesis were twofold (i) to implement disk persistent storage in CATS,which would allow the records and state of the nodes to be persisted on disk and (ii) todesign nodes failure recovery-algorithm for CATS which enable the system to run with theassumption of a Fail Recovery model without violating consistency.

For disk persistent storage two existing key/value databases LevelDB [2] and BerkleyDB[3] are used. LevelDB is an implementation of log structured merged trees [4] where asBerkleyDB is an implementation of log structured B+ trees [5]. Both have been used as anunderlying local storage for nodes and throughput and latency of the system with each isdiscussed. A technique to improve the performance by allowing concurrent operations onthe nodes is also discussed. The nodes failure-recovery algorithm is designed with a goalto allow the nodes to crash and then recover without violating consistency and also toreinstate availability once the majority of nodes recover. The recovery algorithm is based onpersisting the state variables of Paxos [6] acceptor and proposer and consistent groupmemberships.

For fault-tolerance and recovery, processes also need to copy records from the replicationgroup. This becomes problematic when the number of records and the amount of data ishuge. For this problem a technique for transferring key/value records in bulk is alsodescribed, and its effect on the latency and throughput of the system is discussed.

Place, publisher, year, edition, pages
2012. , 62 p.
Trita-ICT-EX, 2012:175
Keyword [en]
Datastore, Distributed Hash Table (DHT), Atomic Shared Registers, Disk Storage, Consistency, Failure Recovery, Paxos, Group Membership
National Category
Engineering and Technology
URN: urn:nbn:se:kth:diva-121281OAI: diva2:617853
Educational program
Master of Science - Software Engineering of Distributed Systems
Available from: 2013-04-24 Created: 2013-04-24 Last updated: 2013-04-24Bibliographically approved

Open Access in DiVA

fulltext(1449 kB)247 downloads
File information
File name FULLTEXT01.pdfFile size 1449 kBChecksum SHA-512
Type fulltextMimetype application/pdf

By organisation
School of Information and Communication Technology (ICT)
Engineering and Technology

Search outside of DiVA

GoogleGoogle Scholar
Total: 247 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 100 hits
ReferencesLink to record
Permanent link

Direct link