This dissertation presents algorithms for data structures called distributed hash tables (DHT) or structured overlay networks, which are used to build scalable self-managing distributed systems. The provided algorithms guarantee lookup consistency in the presence of dynamism: they guarantee consistent lookup results in the presence of nodes joining and leaving. Similarly, the algorithms guarantee that routing never fails while nodes join and leave. Previous algorithms for lookup consistency either suffer from starvation, do not work in the presence of failures, or lack proof of correctness. Several group communication algorithms for structured overlay networks are presented. We provide an overlay broadcast algorithm, which unlike previous algorithms avoids redundant messages, reaching all nodes in O(log n) time, while using O(n) messages, where n is the number of nodes in the system. The broadcast algorithm is used to build overlay multicast. We introduce bulk operation, which enables a node to efficiently make multiple lookups or send a message to all nodes in a specified set of identifiers. The algorithm ensures that all specified nodes are reached in O(log n) time, sending maximum O(log n) messages per node, regardless of the input size of the bulk operation. Moreover, the algorithm avoids sending redundant messages. Previous approaches required multiple lookups, which consume more messages and can render the initiator a bottleneck. Our algorithms are used in DHT-based storage systems, where nodes can do thousands of lookups to fetch large files. We use the bulk operation algorithm to construct a pseudo-reliable broadcast algorithm. Bulk operations can also be used to implement efficient range queries. Finally, we describe a novel way to place replicas in a DHT, called symmetric replication, that enables parallel recursive lookups. Parallel lookups are known to reduce latencies. However, costly iterative lookups have previously been used to do parallel lookups. Moreover, joins or leaves only require exchanging O(1) messages, while other schemes require at least log(f) messages for a replication degree of f. The algorithms have been implemented in a middleware called the Distributed k-ary System (DKS), which is briefly described.
2006, 1. , 209 p.
Updated version of the original dissertation. SICS Dissertation: SICS-D-45