Published August 22, 2016 | Version v1
Report Open

Microservices Scheduling for ALICE O2 Facility

  • 1. CERN openlab Summer Student
  • 2. Summer Student Supervisor

Description

Project Specification

This project researches ways to deploy DDS (Dynamic Deployment System) jobs across cluster nodes using Apache Mesos. It is subdivided into three subtasks:

1. DDS Mesos Plugin – This task involves writing a plugin for DDS so that it can deploy agents on cluster nodes using Apache Mesos. The DDS plugin interfaces with Mesos through the Mesos Framework API.

2. Trying out Mantl – This task involves deploying Mantl.io on a test cluster, trying it out, and analysing the complexities of deploying DDS agents through the Mantl GUI. A summary of the advantages and disadvantages of deploying DDS agents through Mantl is to be reported.

3. Automatic Network Topology Detection – This task involves researching ways to automatically infer the underlying network switch topology at Layer 2. This means that network discovery protocols such as SNMP cannot be used. Once the topology is known, one idea is to make use of it in Mesos.

As a result, one will be able to submit DDS jobs on a Mesos-controlled cluster through the usual dds-submit interface that is currently in use.
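To make the first subtask more concrete, the following is a minimal sketch of the resource-offer model that a Mesos framework works with: the master offers resources per node, and the framework accepts the offers it can use and declines the rest. It is a toy simulation in plain Python, not the Mesos Framework API and not the DDS plugin itself. The names `Offer` and `place_agents`, and the per-agent CPU and memory figures, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """A simplified, hypothetical resource offer for one cluster node."""
    node: str
    cpus: float
    mem_mb: int


def place_agents(offers, agents_wanted, cpus_per_agent=1.0, mem_per_agent_mb=512):
    """Accept offers until enough agents are placed; decline the rest.

    Returns (placements, declined), where placements maps node -> agent count
    and declined lists the nodes whose offers were not used.
    """
    placements = {}
    declined = []
    remaining = agents_wanted
    for offer in offers:
        # Number of agents that fit into this offer on both CPU and memory.
        fit = min(int(offer.cpus // cpus_per_agent),
                  offer.mem_mb // mem_per_agent_mb)
        take = min(fit, remaining)
        if take > 0:
            placements[offer.node] = take
            remaining -= take
        else:
            declined.append(offer.node)
    return placements, declined


# Assumed example offers; the values are illustrative only.
offers = [
    Offer("node1", cpus=2.0, mem_mb=2048),
    Offer("node2", cpus=0.5, mem_mb=4096),
    Offer("node3", cpus=4.0, mem_mb=1024),
]
placements, declined = place_agents(offers, agents_wanted=3)
print(placements, declined)
```

In this sketch the framework, not the master, decides which offers to use. This is what lets several frameworks share one cluster without static partitioning.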

Abstract

As academic and industrial computational needs rise, organisations use computer clusters to keep up with them, and CERN is no exception. Several distributed software frameworks exist, each of which solves a particular problem. However, these frameworks assume total control of cluster resources, making it difficult to run them concurrently on the same cluster. Additionally, no single scheduling algorithm or policy satisfies all types of jobs. Apache Mesos, a meta-scheduler for distributed systems, tries to mitigate this problem without resorting to statically partitioning a cluster. In this work we have explored ways of integrating the Dynamic Deployment System (DDS) at CERN with Mesos. As a result, DDS jobs can be run on a Mesos-governed cluster.

Files

SummerStudentReport-KevinNapoli-2.pdf (1.5 MB)
md5:d67b57747860321c981c75f52551508b