FUSE: Lightweight Guaranteed Distributed Failure Notification
2004 (English)
In: Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI), Association for Computing Machinery (ACM), 2004
Conference paper (Refereed)
FUSE is a lightweight failure notification service for building distributed systems. Distributed systems built with FUSE are guaranteed that failure notifications never fail. Whenever a failure notification is triggered, all live members of the FUSE group will hear a notification within a bounded period of time, irrespective of node or communication failures. In contrast to previous work on failure detection, the responsibility for deciding that a failure has occurred is shared between the FUSE service and the distributed application. This allows applications to implement their own definitions of failure. Our experience building a scalable distributed event delivery system on an overlay network has convinced us of the usefulness of this service. Our results demonstrate that the network costs of each FUSE group can be small; in particular, our overlay network implementation requires no additional liveness-verifying ping traffic beyond that already needed to maintain the overlay, making the steady state network load independent of the number of active FUSE groups.
Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2004.
Identifiers
URN: urn:nbn:se:kth:diva-147143
OAI: oai:DiVA.org:kth-147143
DiVA: diva2:727743
The 6th Symposium on Operating Systems Design and Implementation (OSDI), December 6-8, 2004, San Francisco, USA
QC 20140704
2014-06-23, 2014-06-23, 2014-07-04
Bibliographically approved