Report Open Access
Medjkoune, Nawel; Jones, Ben Dylan
The CERN IT Batch service provides computing resources for users at CERN and for the Worldwide LHC Computing Grid. The service comprises around 80,000 CPU cores, of which a large share (currently 15,000) is dedicated to our future batch service powered by HTCondor (https://research.cs.wisc.edu/htcondor/). The new system brings new features, including the ability to use Docker (https://www.Docker.com) to run jobs in the batch service. Linux container technology such as Docker has become very popular with developers because it makes applications easy to package and deploy. HTCondor has a dedicated “universe” for Docker, so providing this as part of the batch service should improve the user experience and make the batch service easier to administer. The project is to provide and test Docker support in the HTCondor service, and to pilot use cases with advanced user communities such as the ATLAS experiment.
Today, the CERN batch system is a facility with more than 10,000 compute nodes. The CERN IT batch service manages this facility and provides computing resources to CERN users and to the Worldwide LHC Computing Grid. In total, the batch system manages around 100,000 CPU cores.
In the past, the CERN batch system ran IBM Platform LSF. Currently, part of the batch farm has migrated to HTCondor, the future workload management software for the CERN batch system.
Recent versions of HTCondor come with new features, including the ability to run jobs inside Docker containers. We would like to test the Docker support in the batch service to address two issues:
● Batch submitters cannot define their own execution environment without the intervention of a system administrator.
● The underlying OS on the worker nodes is coupled to the job environment, so updating that OS may break software compatibility.
In this report, we outline and document the deployment of Docker on the HTCondor worker nodes running CERN CentOS 7, the setup of the Docker universe, and the creation of job routes that transform incoming grid jobs into Docker jobs to be executed in containers. The project also includes the subtask of creating a Scientific Linux CERN 6 (SLC6) Docker image for the grid jobs.
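To make the idea of a Docker universe job more concrete, the following minimal sketch builds the text of an HTCondor submit description that asks for a job to run inside a container. It is an illustration only, not the configuration used at CERN: the helper name `docker_submit_description`, the image name `example/slc6-base`, and the file-name prefix are all hypothetical. The submit commands themselves (`universe`, `docker_image`, `executable`, `arguments`, `output`, `error`, `log`, `queue`) are standard HTCondor submit-file commands.

```python
def docker_submit_description(image, executable, arguments="", tag="job"):
    """Return the text of an HTCondor submit description for a Docker universe job.

    `image` is the Docker image the job runs in; `tag` is a hypothetical
    prefix used here only to name the output, error, and log files.
    """
    lines = [
        "universe = docker",           # select HTCondor's Docker universe
        f"docker_image = {image}",     # container image that defines the job environment
        f"executable = {executable}",  # program to run inside the container
    ]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        f"output = {tag}.out",
        f"error = {tag}.err",
        f"log = {tag}.log",
        "queue",                       # submit one instance of the job
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical image name; any image available to the worker nodes would do.
    print(docker_submit_description("example/slc6-base", "/bin/cat", "/etc/redhat-release"))
```

The printed text could be saved to a file and passed to `condor_submit`. Because the image carries the job's environment, the submitter chooses it without changing the OS installed on the worker node, which is the decoupling the two issues above call for.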