Accelerated Approximation for Stochastic Reachability Games: Extended version of paper New algorithms for solving simple stochastic games
2010 (English)Other (Other academic)
In this paper new algorithms for finding optimal values and strategies inturn-based stochastic games with reachability objectives are presented,whose special case are the simple stochastic games considered elsewhere [4,11]. The general idea of these algorithms is to accelerate the successive approximation scheme for solving stochastic games  in which node values are updated in each iteration so that they converge to the optimal values of the game. This scheme is extended with a pair of positional strategies which are updated to remain greedy with respect to the current approximation. This way optimal strategies can be discovered before the current values get close to the optimal ones. The approximation process is accelerated, by predicting an approximate result of several updates of the current valuation and jumping directly to the predicted values.
New algorithms are based on three different acceleration techniques: iterative squaring, linear extrapolation, and linear programming; with different difficulty of performing single iteration and different acceleration level achieved by each of them. For each of these algorithms the complexity of a single iteration is polynomial. It is shown that accelerated algorithms will never perform worse than the basic, non-accelerated one and exponential upper bounds on the number of iterations required to solve a game in the worst case is given. It is also proven that new algorithms increase the frequency with which the greedy strategies are updated. The more often strategies are updated, the higher chances that the algorithm will terminate early. It is proven that the algorithm based on linear programming updates the greedy strategies in every iteration, which makes it similar to the strategy improvement method, where also a new strategy is found in each iteration and this also at the cost of solving linear constraint problems
Paper is concluded with presentation of results of experiments in which new algorithms were run on a sample of randomly generated games. It could be observed that the proposed acceleration techniques reduce the number of iterations of the basic algorithm by an order of magnitude and that they substantially increase frequency with which the greedy strategies are updated. The algorithms based on linear programming and linear extrapolation displayed similar efficiency as the ones based on strategy improvement. This makes the algorithm based on linear extrapolation especially attractive because it uses much simpler computations than linear constraint solving.
Place, publisher, year, edition, pages
simple stochastic games, successive approximation, strategy improvement, complexity
Research subject Computer Science
IdentifiersURN: urn:nbn:se:uu:diva-179845OAI: oai:DiVA.org:uu-179845DiVA: diva2:546649
The original paper was published in Proceedings of the Workshop on Games in Design and Verification (GDV 2004), volume 119 of Electronic Notes in Theoretical Computer Science, pages 51–65. Elsevier. http://dx.doi.org/10.1016/j.entcs.2004.07.0082012-08-242012-08-242013-01-22Bibliographically approved