Safety-critical systems are those whose failure can lead to loss of human life or catastrophic damage to the environment. Timeliness is a key requirement in such systems and relates to the notion of response time, i.e., the time a system takes to respond to stimuli from the environment. If the response time exceeds a specified bound, a catastrophe may occur.
Stringent timing requirements make testing a necessary and important process, one that must verify not only the system's functional correctness but also its timing behaviour. However, a key issue for testers is deciding when to stop testing: stopping too early may leave defects in the system, possibly of a severity that leads to catastrophe, while stopping too late wastes time and resources. To date, researchers and practitioners have mainly focused on the design and application of diverse testing strategies, leaving the critical stop-test decision a largely open issue, especially with respect to timeliness.
In the first part of this thesis, we propose a novel approach to making a stop-test decision in the context of testing the worst-case timing characteristics of systems. More specifically, we propose a convergence algorithm that informs the tester whether further testing would reveal significant new insight into the timing behaviour of the system and, if not, suggests that testing be stopped. The convergence algorithm examines the response times observed during testing: it first checks whether the Maximum Observed Response Time (MORT) has recently increased and, when this is no longer the case, investigates whether the distribution of response times has changed significantly. When no significant new information about the system is revealed over a given period, it is concluded, with some statistical confidence, that more testing of the same nature is unlikely to be useful. Other testing techniques may, however, still yield significant new findings.
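The two-stage check described above can be sketched as follows. This is a minimal illustration, not the thesis's exact formulation: the window size, the distance threshold, and the use of a Kolmogorov–Smirnov-style distance as the distribution-change test are all illustrative assumptions.

```python
def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

def suggest_stop(response_times, window=100, ks_threshold=0.1):
    """Illustrative convergence check: suggest stopping only when the
    MORT has not grown in the most recent window AND the response-time
    distribution is stable across two consecutive windows."""
    if len(response_times) < 2 * window:
        return False  # not enough evidence yet
    recent = response_times[-window:]
    earlier = response_times[-2 * window:-window]
    if max(recent) > max(response_times[:-window]):
        return False  # MORT still increasing: keep testing
    # MORT stable; now test whether the distribution has shifted
    return ks_statistic(earlier, recent) <= ks_threshold
```

In this sketch, a stop is suggested only when both conditions hold, mirroring the order of checks in the text: MORT stagnation first, then distributional stability.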
Furthermore, the convergence algorithm is evaluated against the As Low As Reasonably Practicable (ALARP) principle, an underpinning concept in most safety standards. ALARP involves weighing a benefit against its associated cost. To evaluate the convergence algorithm, it is shown that the sacrifice, here further testing time, would be grossly disproportionate to the benefit attained, which in this context is any further significant increase in the MORT after the suggested stopping point.
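The gross-disproportion reasoning might be captured by a simple comparison like the one below. This is purely illustrative: the disproportion factor, the choice of units, and the function itself are assumptions for exposition, not the evaluation procedure used in the thesis.

```python
def grossly_disproportionate(extra_test_time, mort_gain, factor=10.0):
    """Illustrative ALARP-style check: the cost (additional testing
    time) is deemed grossly disproportionate when it exceeds the
    benefit (further MORT increase, expressed in comparable units)
    by more than an assumed disproportion factor."""
    if mort_gain <= 0:
        return True  # no benefit at all, so any cost is disproportionate
    return extra_test_time / mort_gain > factor
```

Under this reading, a stop-test decision is ALARP-compliant if continuing to test past the suggested point would incur cost that is grossly disproportionate to any MORT increase it could still reveal.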
Our algorithm includes a set of tunable parameters. The second part of this work improves the algorithm's performance and scalability in two steps: first, it is determined whether the parameters affect the algorithm's performance at all; second, the most influential parameters are identified and tuned. This process is based on the Design of Experiments (DoE) approach.
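A two-level full-factorial screening, a standard DoE building block, could look like the sketch below. The parameter names and the scoring function are hypothetical; the thesis's actual design and response metric are not reproduced here.

```python
import itertools

def main_effects(levels, score):
    """Two-level full-factorial screening: evaluate `score` at every
    combination of parameter levels, then estimate each parameter's
    main effect as the difference between the mean response at its
    high level and at its low level."""
    names = list(levels)
    runs = []
    for combo in itertools.product(*(levels[n] for n in names)):
        setting = dict(zip(names, combo))
        runs.append((setting, score(setting)))
    effects = {}
    for n in names:
        lo, hi = levels[n]
        lo_mean = sum(s for c, s in runs if c[n] == lo) / (len(runs) / 2)
        hi_mean = sum(s for c, s in runs if c[n] == hi) / (len(runs) / 2)
        effects[n] = hi_mean - lo_mean
    return effects
```

Ranking parameters by the magnitude of their main effects is one way to separate the influential parameters, which are then tuned, from those that can safely be fixed.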
Moreover, the algorithm is required to be robust, which in this context means that it provides valid stop-test decisions across a required range of task sets. For example, if the number of tasks in the system varies from 10 to 50 and the tasks' periods change from the range [200 μs, 400 μs] to the range [200 μs, 1000 μs], the algorithm's performance should not be adversely affected. To achieve robustness, the task-set parameters with the greatest influence on the algorithm's performance are first identified using the Analysis of Variance (ANOVA) approach. It is then examined whether the algorithm remains sound over the required ranges of those parameters and, if not, the situations in which its performance degrades significantly are identified. These situations will be used in our future work to stress-test the algorithm and to tune it so that it becomes robust across the required ranges.
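The ANOVA step can be illustrated with a one-way F statistic: group the algorithm's performance scores by the level of one task-set parameter (e.g. number of tasks) and test whether between-group variation dominates within-group variation. This is a textbook sketch, not the thesis's analysis pipeline.

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic for a list of sample groups, one
    group per level of the task-set parameter under study. A large F
    suggests the parameter strongly influences the response."""
    k = len(groups)                       # number of levels
    n = sum(len(g) for g in groups)       # total observations
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2
                     for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

Computing such an F statistic per task-set parameter and comparing against the relevant critical value is one way to rank parameters by influence before probing the ranges over which the algorithm stays sound.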
Finally, the convergence algorithm was shown to be successful when applied to task sets with similar characteristics. However, in some experiments the algorithm could not suggest a proper stop-test decision in compliance with the ALARP principle, e.g., it stopped sooner than expected. We therefore examine whether the algorithm itself can be further improved, focusing on the statistical test it uses and whether an alternative test would perform better.
Västerås: Mälardalen University, 2016.
Punnekat, Sasikumar, Professor; Bate, Iain, Visiting Professor; Hansson, Hans, Professor.