Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
YARN is the resource management framework for Hadoop and is, in many senses, the modern operating system for the data center. YARN clusters run at organizations such as Yahoo!, Spotify, and Twitter, with clusters of up to 3500 nodes reported in the literature. To harness and efficiently manage that many nodes, YARN must meet requirements such as scalability, serviceability, multitenancy, reliability, high cluster utilization, and secure, auditable operation. Currently, YARN supports three different schedulers for prioritizing the allocation of resources (CPU, memory) to applications. Existing schedulers have a broken incentive model for popular frameworks like Apache Spark and Apache Flink, whose applications have gang-scheduling semantics: they need all of their nodes to be available before they can start work. Users are incentivized to launch applications and hog their resources, because there may be a substantial delay (at Spotify, up to 1 hour) in getting 100 or more nodes allocated to an application, and users are not penalized for hogging resources. The Capacity Scheduler, one of the schedulers that has been used as a default scheduler in YARN, is quite good at sharing resources among tenants with a degree of guaranteed resource availability, but there is still room for improvement. In this thesis, we propose the design and implementation of a new system, the Quota-based access control system, that works as a layer over the Capacity Scheduler in Hops-YARN, a project developed on Apache YARN. The Quota-based access control system allocates a quota of resources to projects.
A project consists of a number of users who manage a number of data sets; the concept is taken from HopsWorks (www.hops.io), a new frontend for Hadoop. Project members can spend part of their quota to launch and run applications. In contrast to existing schedulers, our control system gives users an incentive not to launch unnecessary applications or hog resources. In this work we have also analyzed the operational model of the scheduler, including the Quota-based access control system, under different application scheduling scenarios. We have also investigated failure scenarios, including network partitions and failures of different YARN components, and analyzed their consequences for the scheduling operation. Finally, we propose some future improvements to this scheduling system.
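The quota idea described above can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not state how usage is priced, so the charging rule here (containers multiplied by seconds multiplied by a flat price) and all names (`QuotaLedger`, `Project`, `can_launch`, `charge`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


class QuotaExceeded(Exception):
    """Raised when a project cannot pay for the resources it used."""


@dataclass
class Project:
    name: str
    quota: float                      # remaining credits (hypothetical unit)
    members: set = field(default_factory=set)


class QuotaLedger:
    """Toy access-control layer that sits in front of a scheduler.

    Assumption: cost = containers * seconds * flat price. The real
    pricing model is not given in the abstract.
    """

    def __init__(self, price_per_container_second=1.0):
        self.price = price_per_container_second
        self.projects = {}

    def add_project(self, name, quota, members):
        self.projects[name] = Project(name, quota, set(members))

    def can_launch(self, project, user, containers, min_seconds=1):
        """Admit an application only if the user belongs to the project
        and the project can pay for at least `min_seconds` of runtime."""
        p = self.projects[project]
        cost = containers * min_seconds * self.price
        return user in p.members and p.quota >= cost

    def charge(self, project, containers, seconds):
        """Deduct the cost of holding `containers` for `seconds`.
        Returns the remaining quota; leaves it untouched on failure."""
        p = self.projects[project]
        cost = containers * seconds * self.price
        if cost > p.quota:
            raise QuotaExceeded(f"{project} needs {cost}, has {p.quota}")
        p.quota -= cost
        return p.quota


ledger = QuotaLedger(price_per_container_second=0.5)
ledger.add_project("demo", quota=100.0, members=["alice"])
remaining = ledger.charge("demo", containers=10, seconds=10)
```

Under this assumed rule, containers that are held but idle keep draining the project's quota, which is the kind of penalty for hogging that the existing schedulers lack.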
2016. 71 p.