This master’s thesis addresses the scaling of content distribution sites. In a case study, the thesis investigates issues encountered on ftp.acc.umu.se, a content distribution site run by the Academic Computer Club (ACC) of Umeå University. The site is unusual in that its external network connectivity has higher bandwidth than the components of the system itself, whereas external connectivity is normally the limiting factor. To address this imbalance, a caching approach is proposed for architecting a system that can fully utilize the available network capacity while still presenting a homogeneous resource to the end user. A set of modifications is made to standard open source solutions to make caching perform as required, and results from production deployment of the system are evaluated. In addition, time series analysis and forecasting techniques are introduced as tools to improve the system further, resulting in the implementation of a method that automatically detects bursts and handles load distribution of unusually popular files.
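As a rough illustration of what automatic burst detection on a request time series can look like, the following minimal sketch flags samples that rise well above a trailing moving average. It is not the method implemented in the thesis; the function name `detect_bursts`, the window length, the threshold, and the sample data are all assumptions made for this example.

```python
from collections import deque
from statistics import mean, pstdev


def detect_bursts(counts, window=5, threshold=3.0):
    """Return indices whose count exceeds the trailing mean by more than
    `threshold` standard deviations of the trailing window.

    Hypothetical helper for illustration only; `window` and `threshold`
    are assumed values, not parameters taken from the thesis.
    """
    history = deque(maxlen=window)
    bursts = []
    for i, value in enumerate(counts):
        # Only judge a sample once a full window of history is available.
        if len(history) == window:
            baseline = mean(history)
            spread = pstdev(history)
            # Floor the spread at 1.0 so a perfectly flat history does not
            # flag every tiny fluctuation as a burst.
            if value > baseline + threshold * max(spread, 1.0):
                bursts.append(i)
        history.append(value)
    return bursts


# Made-up requests-per-minute figures for a single file.
requests_per_minute = [10, 12, 11, 9, 10, 11, 95, 12, 10]
print(detect_bursts(requests_per_minute))  # prints [6]
```

In a setup like the one described, a flagged file could then be treated as a candidate for distribution across more cache nodes; how the actual system reacts to a burst is not specified here.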
The High Performance Computing Center North (HPC2N) Super Cluster is a truly self-made high-performance Linux cluster with 240 AMD processors in 120 dual nodes, interconnected with a high-bandwidth, low-latency SCI network. This contribution describes the hardware selected for the system, the work needed to build it, important software issues, and an extensive performance analysis. The performance is evaluated using a number of state-of-the-art benchmarks and software, including STREAM, Pallas MPI, ATLAS DGEMM, High-Performance Linpack and the NAS Parallel Benchmarks. Using these benchmarks, we first determine the raw memory bandwidth and network characteristics, then the practical peak performance of a single CPU, a single dual node and the complete 240-processor system, and finally investigate the parallel performance of non-optimized dusty-deck Fortran applications. In summary, this $500 000 system is extremely cost-effective and shows the performance one would expect of a large-scale supercomputing system with a distributed memory architecture. According to the TOP500 list of June 2002, this cluster was the 94th fastest computer in the world. It is now fully operational and stable as the main computing facility at HPC2N. The system’s utilization figures exceed 90%, i.e. all 240 processors are on average utilized over 90% of the time, 24 hours a day, seven days a week.
In 2007, HPC2N at Umeå University built a new computer room for air-cooled high-density computing (40+ kW/rack) and presented it at several HEPiX meetings.
This follow-up talk goes over the design, how it turned out in practice, upgrades, experiences from 15 years in production, and thoughts for the future.