Performance Aspects of Databases and Virtualized Real-time Applications
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering.
2018 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Context: High computing system performance depends on the interaction between software and hardware layers in modern computer systems. Two strong trends that affect different layers in computer systems are the near-complete replacement of single processors by multiprocessors, which are often organized into clusters, and the virtualization of resources. The performance evaluation of different software on such physical and virtualized resources is the focus of this thesis.

Objectives: The objectives of this thesis are to investigate the performance of SQL and NoSQL database management systems, namely Cassandra, CouchDB, MongoDB, PostgreSQL, and RethinkDB, and of a soft real-time application, namely the voice-driven web. Scheduling algorithms for resource allocation for hard real-time applications on virtual processors are also investigated.

Methods: Experiments are used to measure the performance of SQL and NoSQL database management systems on a cluster. They are also used to develop a prototype and to predict the processor performance of the voice-driven web on multiprocessors. Theoretical methods are used to model and design algorithms to schedule real-time applications on the virtual processor machine. Simulation is used to quantify the performance implications of certain parameter values in our theoretical results and to compare expected performance with theoretical bounds in our schedulability tests.

Results: The performance of Cassandra, CouchDB, MongoDB, PostgreSQL, and RethinkDB is evaluated in terms of write and read throughput and latency in cluster computing. For read throughput, all database systems scale horizontally as the number of cluster nodes increases; however, only Cassandra and CouchDB exhibit scalability for data writing. The overall evaluation shows that Cassandra has the most scalable write throughput as the number of nodes increases, with relatively low latency, whereas PostgreSQL has the lowest write latency and MongoDB has the lowest read latency.

The architectural tradeoffs of the voice-driven web show that the voice engine should be installed on the server rather than on the mobile device, and performance evaluations show that the speech engine scales with the number of cores in the multiprocessor, both with and without hyperthreading.

The thesis presents scheduling techniques for real-time applications that run in virtual machines which time-share the processor. These techniques can be used to define, for each virtual machine, a period and an execution time that allow its real-time applications to meet their deadlines. Simulation results show the impact of the length of different VM periods with respect to overhead. The tradeoffs between resource consumption and period length are also given. Furthermore, a utilization-based test for scheduling real-time applications on a virtual multiprocessor is presented. This test determines whether a task set is schedulable. If the task set is schedulable, the algorithm provides the priority for each task. This algorithm avoids Dhall’s effect, which may cause task sets with even very low utilization to miss deadlines.
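
Dhall’s effect can be made concrete with a small worked example. The sketch below is not taken from the thesis: it uses the classic textbook construction of m light tasks plus one heavy task under global rate-monotonic scheduling on m identical physical processors, and the names `dhall_example` and `eps` are hypothetical.

```python
from fractions import Fraction

def dhall_example(m: int, eps: Fraction):
    """Classic Dhall construction on m identical processors (illustrative only).

    m light tasks: execution time 2*eps, period 1.
    1 heavy task:  execution time 1,     period 1 + eps.
    Deadlines equal periods; all tasks are released at time 0.
    Assumes 0 < eps < 1/2.
    """
    light_c, light_t = 2 * eps, Fraction(1)
    heavy_c, heavy_t = Fraction(1), 1 + eps

    # Global rate monotonic: a shorter period means a higher priority, so the
    # m light tasks occupy all m processors during [0, 2*eps) and again from
    # time 1 until 1 + 2*eps, which is past the heavy task's deadline.
    # The heavy task can therefore only execute during [2*eps, 1).
    heavy_service_by_deadline = light_t - light_c

    total_utilization = m * light_c / light_t + heavy_c / heavy_t
    return {
        "utilization_per_processor": total_utilization / m,
        "heavy_service_by_deadline": heavy_service_by_deadline,
        "heavy_demand": heavy_c,
        "deadline_missed": heavy_service_by_deadline < heavy_c,
    }

result = dhall_example(m=4, eps=Fraction(1, 100))
```

With these assumed values the heavy task misses its deadline even though the utilization per processor is below 0.3; as `eps` shrinks and m grows, the per-processor utilization of the failing task set approaches zero.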

Conclusions: The thesis presents the performance evaluation of read and write throughput and latency for SQL and NoSQL database management systems in cluster computing. The thesis quantifies the tradeoffs of voice-driven web architectures and the performance scalability of the speech engine with respect to the number of cores of the multiprocessor. Furthermore, this thesis proposes scheduling algorithms for real-time applications with hard deadlines on virtual processors, either as a single-core processor or as a multicore processor.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2018. p. 36
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 02
Keywords [en]
SQL and NoSQL database, Bigdata management systems, Structured and non-structured Database Evaluation, Voice-driven web, Multicore performance prediction, Hard real-time Scheduling, Virtual Multiprocessor Scheduling
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-15758
ISBN: 978-91-7295-348-2 (print)
OAI: oai:DiVA.org:bth-15758
DiVA, id: diva2:1173854
Public defence
2018-09-21, J1650, Blekinge Tekniska Högskola, 371 79 Karlskrona, Karlskrona, 13:00 (English)
Opponent
Supervisors
Funder
Sida - Swedish International Development Cooperation Agency
Available from: 2018-01-15 Created: 2018-01-14 Last updated: 2018-06-07
Bibliographically approved
List of papers
1. Performance Evaluation of SQL and NoSQL Database Management Systems in a Cluster
2017 (English)
In: International Journal of Database Management Systems, ISSN 0975-5705, Vol. 9, no 6, p. 1-24
Article in journal (Refereed) Published
Abstract [en]

In this study, we evaluate the performance of SQL and NoSQL database management systems, namely Cassandra, CouchDB, MongoDB, PostgreSQL, and RethinkDB. We use a cluster of four nodes to run the database systems, with external load generators. The evaluation is conducted using data from Telenor Sverige, a telecommunication company that operates in Sweden. The experiments are conducted using three datasets of different sizes. The write throughput and latency as well as the read throughput and latency are evaluated for four queries, namely distance query, k-nearest neighbour query, range query, and region query. For write operations Cassandra has the highest throughput when multiple nodes are used, whereas PostgreSQL has the lowest latency and the highest throughput for a single node. For read operations MongoDB has the lowest latency for all queries. However, Cassandra has the highest throughput for reads. The throughput decreases as the dataset size increases for both write and read, for both sequential as well as random order access. However, this decrease is more significant for random read and write. In this study, we present the experience we had with these different database management systems including setup and configuration complexity.
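
To make the two reported metrics concrete, the following is a minimal sketch of how throughput and mean latency could be derived from timed operations. It is not the measurement harness used in the paper: the function `measure`, its `clock` parameter, and the in-memory list standing in for a database write are all assumptions made for illustration.

```python
import time

def measure(operation, items, clock=time.perf_counter):
    """Run operation(item) for each item; return (throughput, mean_latency).

    throughput is completed operations per second over the whole run;
    mean_latency is the average duration of a single call, in seconds.
    """
    latencies = []
    run_start = clock()
    for item in items:
        start = clock()
        operation(item)
        latencies.append(clock() - start)
    elapsed = clock() - run_start
    throughput = len(latencies) / elapsed
    mean_latency = sum(latencies) / len(latencies)
    return throughput, mean_latency

# Stand-in for a database write (hypothetical): append to an in-memory list.
store = []
throughput, mean_latency = measure(store.append, range(1000))
```

A real evaluation would replace `store.append` with a client call against the database under test and would typically issue requests from several load generators concurrently, which this single-threaded sketch does not attempt.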

Keywords
Trajectory queries, cluster computing, SQL database
National Category
Engineering and Technology Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:bth-15754 (URN)
10.5121/ijdms.2017.9601 (DOI)
Funder
Knowledge Foundation, 20140032
Available from: 2018-01-14 Created: 2018-01-14 Last updated: 2018-01-19
Bibliographically approved
2. Evaluation of Voice-driven Web Application Architecture
2012 (English)
Conference paper, Published paper (Refereed) Published
Abstract [en]

This paper quantifies the implications and trade-offs of three different architectures for voice-driven web applications; the architectures are implemented as prototypes. The prototypes differ in how they produce output speech: using recordings, server-based Text To Speech (TTS), or client-based TTS. The application used in this paper is a highly dynamic weather information source, presented as web feeds or Really Simple Syndication (RSS) feeds. The evaluated quality attributes are performance, maintainability, and development effort. The empirical results show that each system architecture has a different quality profile; for instance, one architecture has the lowest development time but the highest maintainability cost, and another has the lowest bandwidth requirements but the highest development cost. Finally, suggestions are given for the optimal choice of system architecture according to the quality requirements of the final system.

Place, publisher, year, edition, pages
Sorrento: IEEE, 2012
Keywords
Voice based web, IVR application, Web voice quality attributes, Voice driven web evaluation, voice driven architecture
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-6949 (URN)
10.1109/SITIS.2012.86 (DOI)
000315360300079 ()
oai:bth.se:forskinfoDB947E19E0BF5EC4C1257B9B0026A943 (Local ID)
978-0-7695-4911-8 (ISBN)
oai:bth.se:forskinfoDB947E19E0BF5EC4C1257B9B0026A943 (Archive number)
oai:bth.se:forskinfoDB947E19E0BF5EC4C1257B9B0026A943 (OAI)
Conference
8th International Conference on Signal Image Technology and Internet Based Systems (SITIS)
Available from: 2013-07-01 Created: 2013-07-01 Last updated: 2018-01-15
Bibliographically approved
3. Performance evaluation and prediction of open source speech engine on multicore processors
2013 (English)
Conference paper, Published paper (Refereed)
Abstract [en]

This paper quantifies the performance of the core part of the voice-driven web using a free and open source speech engine. The speech engine, which is highly computation-demanding, consists of Automatic Speech Recognition (ASR) and Text To Speech (TTS). Two open source programs, Sphinx-4 and FreeTTS-1.2.2, are used for ASR and TTS respectively. These two programs are executed on two different multicore processors, with 4 hyperthreaded cores and 8 cores respectively. The response time with respect to the load variance and the number of cores is measured and predicted using a linear regression model. The results show that the response time is linear with respect to the input length; this property can be used to directly predict the response time for any input length. Moreover, although the response time and the speedup increase as the number of cores increases, the regression coefficients and number of threads reveal that ASR benefits from multicore. The speedup factor for ASR is 1.56 for 8 cores. For FreeTTS, however, which is sequential, the speedup from the program itself is insignificant; the speedup of about 1.43 for 8 cores comes from the system's contribution. Our findings show that the generalization of the results for multicore processors does not apply to hyperthreading. This paper presents an investigation that is useful for educators, researchers, and application developers in the voice-based applications domain.
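
The linear relationship between input length and response time lends itself to a short illustration. The sketch below fits an ordinary least-squares line and uses it for prediction; the helper names (`fit_line`, `predict`, `speedup`) and the sample measurements are hypothetical and are not taken from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x):
    """Predicted response time for an input of length x."""
    return intercept + slope * x

def speedup(time_one_core, time_n_cores):
    """Speedup factor: single-core time divided by n-core time."""
    return time_one_core / time_n_cores

# Hypothetical measurements (NOT from the paper): input length in words
# and response time in seconds.
lengths = [10, 20, 30, 40]
times = [1.5, 2.5, 3.5, 4.5]
intercept, slope = fit_line(lengths, times)
predicted_50 = predict(intercept, slope, 50)
```

Once the coefficients are fitted for a given core count, `predict` can be evaluated for input lengths that were never measured, which is the use the abstract describes.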

Place, publisher, year, edition, pages
Luxembourg: ACM, 2013
Keywords
linear regression, multicore performance, open source, performance prediction, speech recognition, text to speech, voice driven web
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-6735 (URN)
10.1145/2536146.2536184 (DOI)
9781450320047 (ISBN)
Conference
5th International Conference on Management of Emergent Digital EcoSystems (MEDES), Luxembourg
Available from: 2014-04-14 Created: 2014-04-14 Last updated: 2018-01-15
Bibliographically approved
4. Period assignment in real-time scheduling of multiple virtual machines
2015 (English)
In: Proceedings of the 7th International Conference on Management of computational and collective intElligence in Digital EcoSystems, Association for Computing Machinery (ACM), 2015, p. 180-187
Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2015
Keywords
Virtualization; Real-time scheduling; Hard deadlines; Virtual Machine scheduling; VM period assignment
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-11801 (URN)
10.1145/2857218.2857262 (DOI)
Conference
7th International Conference on Management of computational and collective intElligence in Digital EcoSystems, Sao Paulo, Brazil
Available from: 2016-04-12 Created: 2016-04-12 Last updated: 2018-05-22
Bibliographically approved
5. Real-time scheduling of multiple virtual machines
2017 (English)
In: International journal of Computers and their applications, Vol. 24, no 3, p. 91-109
Article in journal (Refereed) Published
Abstract [en]

The use of virtualized systems is growing, and one would like to benefit from this kind of system also for real-time applications with hard deadlines. There are two levels of scheduling in real-time applications executing in a virtualized environment: traditional real-time scheduling of the tasks in the real-time application inside a Virtual Machine (VM), and scheduling of different VMs on the hypervisor level. Traditional real-time scheduling uses methods based on periods, deadlines and worst-case execution times of the real-time tasks. In order to apply the existing theory also to virtualized environments we must obtain periods and (worst-case) execution times for VMs containing real-time applications. In this paper, we describe a technique for calculating periods and execution times and utilization for VMs containing real-time applications with hard deadlines. We show that when we look at all VMs that share a physical processor we are able to use longer (better) periods. Alternatively, if the periods are the same, we are able to use a smaller amount of the processor resource for the VMs and more tasks become schedulable compared to when we look at each VM in isolation. We also introduce an overhead model that makes it possible to find VM periods that minimize the processor utilization.

Keywords
Real-time virtual machine; real-time scheduling hard deadlines; VM overhead; VM period
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:bth-15755 (URN)
Funder
Knowledge Foundation, 20140032
Available from: 2018-01-14 Created: 2018-01-14 Last updated: 2018-01-19
Bibliographically approved
6. A Utilization-based Schedulability Test of Real-time Systems Running on a Multiprocessor Virtual Machine
(English)
In: The Computer Journal
Article in journal (Refereed) Submitted
Abstract [en]

Virtualization makes it possible to run multiple operating systems and applications on the same physical hardware at the same time using Virtual Machines (VMs). Real-time applications with hard deadlines would also like to benefit from using VMs. The underlying physical infrastructure usually contains many cores. In this paper, we consider a hard real-time application that executes on a VM with multiple virtual cores. Tasks are scheduled globally on the multiprocessor VM using fixed-priority preemptive scheduling. This means that a task can execute on different virtual cores at different instances in time. In order to avoid Dhall’s effect, which may cause task sets with even very low utilization to miss deadlines, we classify tasks into two priority classes, namely heavy and light tasks. Heavy tasks have higher priority than light tasks. For light tasks we use rate monotonic priority assignment. In this paper we propose a utilization-based test that shows if a task set is schedulable or not. If the task set is schedulable, the test also provides an assignment of priorities to tasks. The input to the test is the task set, the number of cores (processors) in the VM, the period for the multiprocessor VM, the VM’s deadline, the execution time, and the blocking time when the VM does not have access to the underlying hardware in each period. This work generalizes previous work by introducing the VM’s deadline as a parameter. We validate our study by simulation; the results show that the priority assignment used by our algorithm schedules a higher number of task sets than rate monotonic (RM) priority assignment.
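
The heavy/light classification can be sketched in a few lines. The abstract does not state the paper's utilization threshold or bound, so the example below substitutes the classic RM-US[m/(3m-2)] rule for m identical physical processors as a stand-in: tasks with utilization above m/(3m-2) receive the highest priorities, the remaining tasks are ordered by rate monotonic, and the set passes if total utilization is at most m²/(3m-2). It ignores the VM period, deadline, and blocking time that the paper's test takes as input, and the names `Task` and `rm_us_test` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    name: str
    wcet: float    # worst-case execution time
    period: float  # period; the deadline is assumed equal to the period

    @property
    def utilization(self) -> float:
        return self.wcet / self.period

def rm_us_test(tasks, m):
    """RM-US[m/(3m-2)] on m identical processors (stand-in, not the paper's test).

    Returns (schedulable, priority_order) with the highest priority first.
    """
    threshold = m / (3 * m - 2)   # utilization above this marks a heavy task
    bound = m * m / (3 * m - 2)   # sufficient total-utilization bound
    heavy = [t for t in tasks if t.utilization > threshold]
    # Light tasks follow rate monotonic: shorter period, higher priority.
    light = sorted((t for t in tasks if t.utilization <= threshold),
                   key=lambda t: t.period)
    total = sum(t.utilization for t in tasks)
    return total <= bound, heavy + light

# Hypothetical task set on a 2-core platform.
tasks = [Task("A", 1, 10), Task("B", 6, 10), Task("C", 1, 5)]
schedulable, order = rm_us_test(tasks, m=2)
priority_names = [t.name for t in order]
```

The test is sufficient but not necessary: a task set that fails the bound may still be schedulable, which is why the paper compares the number of accepted task sets against plain RM by simulation.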

Keywords
Hard real-time scheduling; Real-time system; Multiprocessor utilization based schedulability test; Virtual multiprocessor scheduling; Global fixed priority scheduling; VM deadline
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:bth-15756 (URN)
Funder
Knowledge Foundation, 20140032
Available from: 2018-01-14 Created: 2018-01-14 Last updated: 2018-01-19
Bibliographically approved

Open Access in DiVA

fulltext (501 kB), 70 downloads
File information
File name FULLTEXT01.pdf
File size 501 kB
Checksum SHA-512
a00849557bd8de53d9e0bf20a16910d72754a509346e091392e97923400dcfff152109c5f0b0cb1ea6e772dbf022e5191e733adc85e0618e47eff4a0ae6b526a
Type fulltext
Mimetype application/pdf

Search in DiVA

By author/editor
Niyizamwiyitira, Christine
By organisation
Department of Computer Science and Engineering
Computer Sciences
