Genium Data Store: Distributed Data store
KTH, School of Information and Communication Technology (ICT).
2013 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

In recent years, the need for distributed data storage has driven the design of new systems for large-scale environments. The growth of unbounded data streams, together with the need to store and analyze them in real time in a reliable, scalable, and fast manner, explains the appearance of such systems in the financial sector, and at the stock exchange operator Nasdaq OMX (NOMX) in particular. Nasdaq OMX also uses an internally designed, reliable, totally ordered message bus for almost all of its internal subsystems. Reliable totally ordered multicast has been studied extensively in academia, both theoretically and practically, and has been shown to serve as a fundamental building block for distributed fault-tolerant applications. In this work, we leverage the NOMX low-latency, reliable, totally ordered message bus, which has a capacity of at least 2 million messages per second, to build a high-performance distributed data store. Consistency of data operations is easily achieved because the bus forwards all messages in reliable total order. Relying on the same messaging, the data store also integrates active in-memory replication for fault tolerance and load balancing. A prototype was developed against production environment requirements to demonstrate feasibility. Experimental results show that the prototype scales and performs well, serving around 400,000 insert operations per second over 6 data nodes with a latency of 100 microseconds. Latency for single-record read operations stays below half a millisecond, while data ranges are retrieved with sub-100 Mbps capacity from one node. Performance improvements with a greater number of data store nodes are shown for both writes and reads. We conclude that uniform, totally ordered, sequenced input data can be used in real time for large-scale distributed data storage to maintain strong consistency, fault tolerance, and high performance.
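The abstract's central claim, that replicas stay consistent when every operation reaches them through a reliable totally ordered bus, can be illustrated with a small sketch. This is not the thesis's implementation: `Sequencer`, `Replica`, and `Message` are hypothetical names, the bus is reduced to an in-process counter that stamps sequence numbers, and each replica is assumed to be a deterministic in-memory key-value map.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Message:
    seq: int                      # position in the total order
    op: str                       # "put" or "delete"
    key: str
    value: Optional[str] = None   # unused for "delete"


class Sequencer:
    """Hypothetical stand-in for the message bus: stamps each
    operation with the next sequence number."""

    def __init__(self) -> None:
        self._next_seq = 1

    def publish(self, op: str, key: str, value: Optional[str] = None) -> Message:
        msg = Message(self._next_seq, op, key, value)
        self._next_seq += 1
        return msg


class Replica:
    """In-memory replica that applies messages strictly in sequence order."""

    def __init__(self) -> None:
        self.store: Dict[str, Optional[str]] = {}
        self.next_seq = 1                        # next sequence number to apply
        self._pending: Dict[int, Message] = {}   # messages that arrived early

    def deliver(self, msg: Message) -> None:
        if msg.seq < self.next_seq:
            return                               # already applied: ignore duplicate
        self._pending[msg.seq] = msg
        # Apply every message that is now contiguous with the applied prefix.
        while self.next_seq in self._pending:
            self._apply(self._pending.pop(self.next_seq))
            self.next_seq += 1

    def _apply(self, msg: Message) -> None:
        if msg.op == "put":
            self.store[msg.key] = msg.value
        elif msg.op == "delete":
            self.store.pop(msg.key, None)
        else:
            raise ValueError(f"unknown operation: {msg.op}")


if __name__ == "__main__":
    bus = Sequencer()
    log = [
        bus.publish("put", "order:1", "buy 100"),
        bus.publish("put", "order:1", "buy 150"),
        bus.publish("delete", "order:2"),
        bus.publish("put", "order:2", "sell 20"),
    ]
    a, b = Replica(), Replica()
    for m in log:
        a.deliver(m)
    for m in reversed(log):   # simulate out-of-order arrival on the network
        b.deliver(m)
    print(a.store == b.store)
```

Because both replicas apply the same operations in the same sequence, their stores end up identical regardless of arrival order, which is why any replica can serve reads and take over after a failure.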

Place, publisher, year, edition, pages
2013. 64 p.
TRITA-ICT-EX, 2013:182
National Category
Computer and Information Science
URN: urn:nbn:se:kth:diva-141552
OAI: diva2:697383
Educational program
Master of Science - Distributed Computing
Available from: 2014-02-20. Created: 2014-02-18. Last updated: 2014-02-20. Bibliographically approved.

Open Access in DiVA

fulltext (1369 kB)
File information
File name: FULLTEXT01.pdf
File size: 1369 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf