Compaction Strategies in Apache Cassandra: Analysis of Default Cassandra stress model
Blekinge Institute of Technology, Faculty of Computing, Department of Communication Systems. (Telecommunications)
2016 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

Context. The present trend in a wide variety of applications, ranging from the web and social networking to telecommunications, is to gather and process very large and fast-growing amounts of information, leading to a common set of problems known collectively as “Big Data”. Over the last decade, the ability to run large-scale data analytics over a large number of data sets has proved to be a competitive advantage in a wide range of industries such as retail, telecom and defense. In response to this trend, the research community and the IT industry have proposed a number of platforms to facilitate large-scale data analytics. Such platforms include a new class of databases, often referred to as NoSQL data stores. Apache Cassandra is one such NoSQL data store. This research focuses on analyzing the performance of different compaction strategies in different use cases for the default Cassandra stress model.

Objectives. The performance of the compaction strategies is observed in various scenarios on the basis of three use cases for the default Cassandra stress model: write heavy (90/10), read heavy (10/90) and balanced (50/50). The aim is to provide the events and specifications that suggest when to switch from one compaction strategy to another.

Methods. A single-node Cassandra deployment is set up on a web server, and its read and write performance with different compaction strategies is studied under read-heavy, write-heavy and balanced workloads. Its performance metrics are collected and analyzed.

Results. Performance metrics of the different compaction strategies are evaluated and analyzed.

Conclusions. Based on a detailed analysis and comparison, we conclude that the Leveled Compaction Strategy performs better than the size-tiered and date-tiered compaction strategies for a read-heavy (10/90) workload when using the default Cassandra stress model. For the balanced (50/50) workload, the Date Tiered Compaction Strategy performs better than the other strategies compared, including size-tiered compaction.

Place, publisher, year, edition, pages
2016. 33 p.
Keyword [en]
Big data platforms, Cassandra, NoSQL database.
URN: urn:nbn:se:bth-12850
OAI: diva2:946772
Subject / course
ET2580 Master's Thesis (120 credits) in Electrical Engineering with emphasis on Telecommunication Systems
Educational program
ETATE Master of Science Programme in Electrical Engineering with emphasis on Telecommunication Systems
2016-05-31, J1640, Valhalavagen, 37141, Karlskrona, 13:00 (English)
Available from: 2016-07-06 Created: 2016-07-05 Last updated: 2016-07-06 Bibliographically approved

By author/editor
Ravu, Venkata Sathya Sita J S