Using execution trace data to improve distributed systems
2002 (English). In: Software, practice & experience, ISSN 0038-0644, E-ISSN 1097-024X, Vol. 32, no 9, 889-906 p. Article in journal (Refereed). Published
One of the most challenging problems facing today's software engineer is to understand and modify distributed systems. One reason is that in actual use systems frequently behave differently than the developer intended. In order to cope with this challenge, we have developed a three-step method to study the run-time behavior of a distributed system. First, remote procedure calls are traced using CORBA interceptors. Next, the trace data is parsed to construct RPC call-return sequences, and summary statistics are generated. Finally, a visualization tool is used to study the statistics and look for anomalous behavior. We have been using this method on a large distributed system (more than 500000 lines of code) with data collected during both system testing and operation at a customer's site. Despite the fact that the distributed system had been in operation for over three years, the method has uncovered system configuration and efficiency problems. Using these discoveries, the system support group has been able to improve product performance and their own product maintenance procedures.
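The second step of the method (building call-return sequences from trace data and generating summary statistics) can be illustrated with a minimal sketch. The abstract does not describe the trace format or the authors' tooling, so everything here is an assumption: the record layout `(timestamp, request_id, event, operation)`, the function names `pair_calls` and `summarize`, and the choice of statistics are all hypothetical. Collection through CORBA interceptors and the visualization step are not shown.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trace record layout (not taken from the paper):
#   (timestamp_seconds, request_id, event, operation)
# where event is either "call" or "return".


def pair_calls(records):
    """Match each "call" record with its "return" record by request id.

    Returns a list of (operation, latency_seconds) pairs and a sorted list
    of request ids whose call never saw a matching return.
    """
    pending = {}  # request_id -> (call timestamp, operation)
    pairs = []
    for ts, req_id, event, op in sorted(records, key=lambda r: r[0]):
        if event == "call":
            pending[req_id] = (ts, op)
        elif event == "return" and req_id in pending:
            start, call_op = pending.pop(req_id)
            pairs.append((call_op, ts - start))
    return pairs, sorted(pending)


def summarize(pairs):
    """Group latencies by operation and compute count, mean, and max."""
    by_op = defaultdict(list)
    for op, latency in pairs:
        by_op[op].append(latency)
    return {
        op: {"count": len(v), "mean": mean(v), "max": max(v)}
        for op, v in by_op.items()
    }
```

In a sketch like this, calls left without a return and operations whose maximum latency is far above their mean would be natural candidates for the kind of anomalous behavior the abstract mentions, although the criteria the authors actually used are not stated here.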
Place, publisher, year, edition, pages
2002. Vol. 32, no 9, 889-906 p.
Identifiers
URN: urn:nbn:se:ltu:diva-15107
DOI: 10.1002/spe.466
Local ID: e9187a30-12bc-11dd-ada4-000ea68e967b
OAI: oai:DiVA.org:ltu-15107
DiVA: diva2:988080
Validated; 2002; 20080425 (ysko)
2016-09-29 2016-09-29 Bibliographically approved