Verification of linear scalability of a business Big Data platform against the Queueing Networks model
KTH, School of Electrical Engineering and Computer Science (EECS).
2019 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Ensuring that software built on top of distributed systems for Big Data scales well is crucial for designing long-lasting and reliable products. The purpose of this Master's thesis is to investigate and characterize the scalability of a business Big Data platform, URights, developed by IBM in cooperation with the French association SACEM. The work focuses on the initial step in URights, called Ingestion. Scalability is examined in the context of proportionally growing workloads and resources, and the study leads to a set of recommendations for the platform. The applicability of different performance evaluation techniques to the assessment of scalability is also examined. Three evaluation methods are used. First, a mathematical analysis based on Queueing Networks (QNs) is conducted. Then, a simulation engine extending the QN model is designed and a set of simulations is run. Finally, an empirical evaluation is conducted in a test environment. Due to the scarcity of data and stark differences between the test and production environments, the reliability of the empirical results is questionable. The mathematical analysis and simulations suggest that fine-grained parallelism is desirable for low and medium workloads. Also, if more data must be processed, it is better to increase the frequency of the batches used in URights rather than their size. Finally, a potential bottleneck operation is identified. Using different evaluation methods makes it possible to progressively formulate and investigate more complicated questions, and to observe the limits and benefits of each tool.
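To make the idea of "proportionally growing workloads and resources" concrete, here is a minimal sketch using a single textbook M/M/c station. This is an assumption for illustration only: it is not the Queueing Network model from the thesis, and the function names, arrival rate, and service rate are hypothetical. It computes the mean response time when the arrival rate and the number of servers are multiplied by the same factor, so that utilization stays constant.

```python
def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arriving job must wait in an M/M/c queue.

    offered_load is lambda / mu and must be smaller than the number of servers.
    """
    if offered_load >= servers:
        raise ValueError("unstable queue: offered load must be below server count")
    # Erlang B via the standard recursion, which avoids large factorials.
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    utilization = offered_load / servers
    # Convert Erlang B (loss probability) to Erlang C (waiting probability).
    return blocking / (1.0 - utilization * (1.0 - blocking))


def mean_response_time(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Mean time in system (waiting plus service) for an M/M/c queue."""
    wait_probability = erlang_c(servers, arrival_rate / service_rate)
    mean_wait = wait_probability / (servers * service_rate - arrival_rate)
    return mean_wait + 1.0 / service_rate


if __name__ == "__main__":
    base_arrival_rate = 0.8   # hypothetical jobs per time unit
    service_rate = 1.0        # hypothetical jobs per time unit per server
    for factor in (1, 2, 4, 8):
        t = mean_response_time(base_arrival_rate * factor, service_rate, factor)
        print(f"scale x{factor}: mean response time = {t:.4f}")
```

In this simplified setting the mean response time does not grow when load and servers are scaled together; it shrinks toward the bare service time, because a larger pool of servers absorbs bursts more easily. A real platform adds coordination and data-movement costs that a single M/M/c station ignores.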

Abstract [sv] (English translation)

Software that controls distributed systems must scale well, a crucial requirement for durable and reliable products. The purpose of this thesis is to investigate and describe the scalability of the Big Data platform URights, which has been developed by IBM together with the French organisation SACEM. The work focuses on the first step of URights, called Ingestion, and scalability is examined by letting resources grow linearly in proportion to increased workloads. This leads to a number of recommendations for the continued development of the platform. A number of different ways of measuring performance together with scalability are examined. Three evaluation methods are used in the report. First, a mathematical analysis based on Queueing Networks (QN) is carried out. Then a simulation model is developed on top of it and a number of simulations of the system are run. Finally, an empirical evaluation is carried out in a test environment. Because of a lack of data and large differences between the test and production environments, the empirical results cannot be regarded as fully established. The mathematical analysis and the simulations suggest that fine-grained parallelism is preferable for small and medium-sized workloads. If more data is to be processed, it is better to increase the number of batches in URights than to increase their size. A possible bottleneck operation is also identified. In summary, the use of several evaluation methods gives greater scope to gradually formulate and test increasingly complicated questions, while taking into account the advantages and disadvantages of each tool.

Place, publisher, year, edition, pages
2019. p. 66
Series
TRITA-EECS-EX ; 2019:69
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-249636
OAI: oai:DiVA.org:kth-249636
DiVA id: diva2:1304951
External cooperation
IBM France
Educational program
Master of Science in Engineering - Computer Science and Technology
Supervisors
Examiners
Available from: 2019-05-15. Created: 2019-04-15. Last updated: 2019-05-15. Bibliographically approved.

Open Access in DiVA

fulltext (1469 kB), 18 downloads
File information
File name: FULLTEXT01.pdf
File size: 1469 kB
Checksum (SHA-512): f6a83fb3080223c144b60b87000ea9bca44c45513b04169667965bf44f913333fcc968dddcb825ecd1cd278b38fb81d804115d4522d84433a0bc54c8047b10fd
Type: fulltext
Mimetype: application/pdf

