Predicting Service Metrics from Device and Network Statistics
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
For an IT company that provides a service over the Internet, such as Facebook or Spotify, providing a high quality of service is very important; however, predicting the quality of service is generally a hard task. The goal of this thesis is to investigate whether an approach based on statistical learning can accurately predict the quality of service of a Voldemort key-value store in the presence of dynamic load patterns and network conditions. The approach follows the idea that the service-level metrics associated with the quality of service can be estimated from server-side statistical observations, such as device and network statistics. The advantage of the approach analysed in this thesis is that it can work with virtually any kind of service, since it relies only on device and network statistics, which are independent of the type of service provided.
The approach is structured as follows. While the service is running, a large number of device statistics from the Linux kernel (e.g. CPU usage level, disk activity, interrupt rate) and some basic end-to-end network statistics (e.g. average round-trip time, packet loss rate) are periodically collected on the service platform. At the same time, some service-level metrics (e.g. average read time, average write time) are collected on the client machine as indicators of the store's quality of service. To emulate network conditions such as dynamic delay and packet loss, all traffic is redirected through a network emulator. Then, different statistical learning methods, based on linear and tree-based regression algorithms, are applied to the collected data to obtain a model that accurately predicts the service-level metrics from the device and network statistics.
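The learning step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it uses a single hypothetical feature (`cpu_usage`) and a hypothetical target (`read_latency_ms`), synthetic data, a one-variable least-squares fit as the linear method, and a depth-1 regression tree (a "stump") as the tree-based method. The thesis works with many device and network statistics and with full regression algorithms.

```python
import random
from statistics import mean


def fit_linear(xs, ys):
    """Ordinary least squares for one feature: y is approximated by a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x


def fit_stump(xs, ys):
    """Depth-1 regression tree: choose the split that minimises squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        l_mean, r_mean = mean(left), mean(right)
        sse = sum((y - l_mean) ** 2 for y in left)
        sse += sum((y - r_mean) ** 2 for y in right)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, threshold, l_mean, r_mean)
    _, threshold, l_mean, r_mean = best
    return lambda x: l_mean if x <= threshold else r_mean


def demo():
    """Fit both models on synthetic samples and report test error."""
    rng = random.Random(0)
    # Hypothetical data: latency grows with CPU usage, plus noise.
    cpu_usage = [rng.uniform(0, 100) for _ in range(200)]
    read_latency_ms = [2.0 + 0.05 * c + rng.gauss(0, 0.2) for c in cpu_usage]
    train_x, test_x = cpu_usage[:150], cpu_usage[150:]
    train_y, test_y = read_latency_ms[:150], read_latency_ms[150:]
    for name, fit in (("linear", fit_linear), ("stump", fit_stump)):
        model = fit(train_x, train_y)
        mae = mean(abs(y - model(x)) for x, y in zip(test_x, test_y))
        print(f"{name}: mean absolute error on test set = {mae:.3f} ms")


demo()
```

The split into a training part and a test part mirrors the usual practice of evaluating a model on samples it has not seen during fitting.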
The results, obtained for different traffic scenarios and configurations, show that the approach can find learning models that accurately predict the service-level metrics for a single-node store, with a normalized mean absolute error (NMAE) below 20%, even in the presence of network impairments.
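As an illustration of the error measure, the sketch below computes NMAE under the assumption that it is defined as the mean absolute error divided by the mean of the observed values; the abstract does not spell out the exact normalisation, and the function name `nmae` and the sample values are hypothetical.

```python
def nmae(observed, predicted):
    """Mean absolute error normalised by the mean of the observed values.

    Assumed definition: (1/m) * sum(|y_i - y_hat_i|) / mean(y).
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    if mean_observed == 0:
        raise ValueError("mean of observed values is zero; NMAE is undefined")
    mae = sum(abs(y - y_hat) for y, y_hat in zip(observed, predicted)) / len(observed)
    return mae / mean_observed


# Hypothetical read times in milliseconds and their predictions.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(f"NMAE = {nmae(observed, predicted):.1%}")
```

Under this definition, a value below 0.20 corresponds to the "lower than 20%" threshold quoted in the results.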
Quality of service, machine learning, network statistics, key-value store, Voldemort
Communication Systems Computer Systems
Identifiers: URN: urn:nbn:se:kth:diva-175892; OAI: oai:DiVA.org:kth-175892; DiVA: diva2:864158
2015-10-02, SICS, Kista, Stockholm, 13:30