Journal article Open Access
Nishant Saurabh; Radu Prodan
Infrastructure-as-a-Service (IaaS) Clouds concurrently accommodate diverse sets of user requests, requiring an efficient strategy for storing and retrieving virtual machine images (VMIs) at large scale. VMI storage management must deal with multiple VMIs, each typically on the order of gigabytes, which leads to VMI sprawl that hinders elastic resource management and provisioning. Unfortunately, existing techniques for VMI management overlook VMI semantics (i.e., at the level of base image and software packages), and as a result either offer limited ability to identify and extract reusable functionality or incur higher VMI publishing and retrieval overheads. In this paper, we propose Expelliarmus, a novel VMI management system that helps to minimize VMI storage, publishing, and retrieval overheads. To achieve this goal, Expelliarmus incorporates three complementary features. First, it models VMIs as semantic graphs to facilitate their similarity computation. Second, it provides semantically aware VMI decomposition and base image selection to extract and store non-redundant base images and software packages. Third, it assembles VMIs from the required software packages upon user request. We evaluate Expelliarmus with a representative set of synthetic Cloud VMIs on a real test-bed. Experimental results show that our semantic-centric approach is able to optimize the repository size by times compared to state-of-the-art systems (e.g., IBM’s Mirage and Hemera), with a significant improvement in VMI publishing performance and a slight improvement in retrieval performance.
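The three features described above can be illustrated with a minimal sketch. It is not the Expelliarmus implementation: the abstract does not specify the graph model or the similarity metric, so this sketch assumes a VMI can be reduced to a base image plus a set of package names, and it swaps in plain Jaccard similarity over those sets in place of a semantic-graph comparison. The names `VMI`, `similarity`, and `Repository` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VMI:
    """Hypothetical VMI description: one base image plus installed packages."""
    name: str
    base_image: str
    packages: frozenset


def similarity(a: VMI, b: VMI) -> float:
    """Jaccard similarity of package sets (an assumed stand-in metric).

    VMIs built on different base images are treated as dissimilar (0.0).
    """
    if a.base_image != b.base_image:
        return 0.0
    union = a.packages | b.packages
    if not union:
        return 1.0  # two bare images on the same base are identical
    return len(a.packages & b.packages) / len(union)


@dataclass
class Repository:
    """Stores each base image and each package once, plus per-VMI manifests."""
    base_images: set = field(default_factory=set)
    packages: set = field(default_factory=set)
    manifests: dict = field(default_factory=dict)

    def publish(self, vmi: VMI) -> None:
        # Decomposition: only parts not already stored enlarge the repository.
        self.base_images.add(vmi.base_image)
        self.packages |= vmi.packages
        self.manifests[vmi.name] = (vmi.base_image, vmi.packages)

    def retrieve(self, name: str) -> VMI:
        # Assembly: rebuild the VMI description from its manifest on request.
        base_image, packages = self.manifests[name]
        return VMI(name, base_image, packages)

    def stored_units(self) -> int:
        # Number of distinct stored parts (base images plus packages).
        return len(self.base_images) + len(self.packages)


web = VMI("web", "ubuntu-16.04", frozenset({"nginx", "python3", "openssl"}))
db = VMI("db", "ubuntu-16.04", frozenset({"postgresql", "python3", "openssl"}))

repo = Repository()
repo.publish(web)
repo.publish(db)
```

In this toy setup the two VMIs share a base image and two of their four distinct packages, so `similarity(web, db)` evaluates to 0.5 and the repository holds five distinct parts instead of the eight that storing both VMIs whole would require. Real VMIs would additionally need package versions, dependencies, and the actual image contents, none of which this sketch models.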