Object serialization vs relational data modelling in Apache Cassandra: a performance evaluation
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering.
2015 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

Context. In newer database solutions designed for large-scale, cloud-based services, database performance is of particular concern as these services face scalability challenges due to I/O bottlenecks. These issues can be alleviated through various data model optimizations that reduce I/O loads. Object serialization is one such approach.

Objectives. This study investigates the performance of serialization using the Apache Avro library in the Cassandra database. Two different serialized data models are compared with a traditional relational database model.

Methods. This study uses an experimental approach that compares read and write latency using Twitter data in JSON format.

Results. Avro serialization is found to improve performance. However, the extent of the performance benefit is found to be highly dependent on the serialization granularity defined by the data model.

Conclusions. The study concludes that developers seeking to improve database throughput in Cassandra through serialization should prioritize data model optimization, as serialization by itself will not outperform relational modelling in all use cases. The study also recommends further work on additional use cases, as serialization has potential performance issues that this study does not cover.
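
The notion of serialization granularity mentioned in the results can be illustrated with a small sketch. This is not the thesis's actual data models or code: it uses Python's standard `json` module as a stand-in for Avro, plain dictionaries as a stand-in for Cassandra rows, and a hypothetical tweet record whose field names are assumptions. It only shows why reading one field may require deserializing more or fewer bytes depending on how coarsely the object is serialized.

```python
import json

# Hypothetical tweet record; field names are illustrative, not taken from the thesis.
tweet = {
    "id": 1,
    "text": "hello",
    "user": {"id": 42, "name": "alice"},
    "entities": {"hashtags": ["cassandra", "avro"]},
}


def to_relational(t):
    """One column per field, with nested objects flattened."""
    return {
        "id": t["id"],
        "text": t["text"],
        "user_id": t["user"]["id"],
        "user_name": t["user"]["name"],
        "hashtags": list(t["entities"]["hashtags"]),
    }


def to_coarse(t):
    """Coarse granularity: the whole object is serialized into one blob column."""
    return {"id": t["id"], "blob": json.dumps(t).encode("utf-8")}


def to_fine(t):
    """Finer granularity: each nested object is serialized into its own blob column."""
    return {
        "id": t["id"],
        "text": t["text"],
        "user": json.dumps(t["user"]).encode("utf-8"),
        "entities": json.dumps(t["entities"]).encode("utf-8"),
    }


def read_user_name(row, model):
    """Return the user name and the number of bytes that had to be deserialized."""
    if model == "relational":
        return row["user_name"], 0
    if model == "coarse":
        return json.loads(row["blob"])["user"]["name"], len(row["blob"])
    if model == "fine":
        return json.loads(row["user"])["name"], len(row["user"])
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    rows = {
        "relational": to_relational(tweet),
        "coarse": to_coarse(tweet),
        "fine": to_fine(tweet),
    }
    for model, row in rows.items():
        name, nbytes = read_user_name(row, model)
        print(f"{model}: name={name}, bytes deserialized={nbytes}")
```

In this toy setup the coarse model stores the fewest columns but must decode the entire object to answer a single-field read, while the finer model decodes only the relevant sub-object. Whether that trade-off helps or hurts in a real Cassandra deployment depends on the workload, which is the question the thesis evaluates experimentally.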

Place, publisher, year, edition, pages
2015. 38 p.
Keyword [en]
Distributed systems organizing principles, information storage technologies, data structures and algorithms for data management
National Category
Computer Science
Identifiers
URN: urn:nbn:se:bth-10391
OAI: oai:DiVA.org:bth-10391
DiVA: diva2:839521
External cooperation
Telefonaktiebolaget L. M. Ericsson
Subject / course
DV1478 Bachelor Thesis in Computer Science
Educational program
DVGDS Computer and System Science
Supervisors
Examiners
Available from: 2015-08-05. Created: 2015-07-02. Last updated: 2015-08-05. Bibliographically approved.

Open Access in DiVA

fulltext (914 kB), 661 downloads
File information
File name: FULLTEXT02.pdf
File size: 914 kB
Checksum (SHA-512): 70f7e7c6bead54b921864acfc888c6cd38eb8e6e13863bf764d044817daf74d452bf94ec1889eddbe80357df261f85979ebb126c71fba49b97365e2dcc0fa2b3
Type: fulltext
Mimetype: application/pdf

Total: 661 downloads
The number of downloads is the sum of all downloads of full texts. It may include, for example, previous versions that are no longer available.

Total: 690 hits