Elastic Private Cloud Management Framework (SLIM-VM)

Cloud computing has become a major trend in both academia and the IT industry in recent years. There are two types of cloud: public and private. A public cloud provides services to anyone over the Internet, which means users must store their data at remote sites rather than locally. For security reasons, however, many organizations and enterprises choose to build private clouds that host their data and applications for internal use only. Data center administrators often face situations in which many user groups request cluster time while there are too few physical machines to serve them all. At the same time, powerful server nodes may be underutilized if each is allocated to only one user at a time. Instead of allocating resources on a coarse-grained, per-node basis, we built an Elastic Private Cloud Management Framework that greatly simplifies management. With this framework, we can satisfy requests from several parties simultaneously while also improving cluster utilization. The system gives the administrator a centralized way to manage multiple virtual clusters (v-clusters) that share the same underlying physical resources. Because virtualization provides good isolation, end users are not aware that the machines are in fact shared with others. The system has several capabilities that simplify management work, described below.


SLIM Core Components

  • Centralized and application-centric software management. All OS images and grid middleware are stored and managed in a central InstantGrid server. These software components are grouped into distinct, pre-defined execution environments (EEs); each EE targets a specific type of application. For example, service-oriented distributed applications and job-submission-based HPC rely on two very different EEs. This model guarantees well-defined EEs for (and hence compatibility with) various grid applications. The centrally managed EEs are disseminated to the compute nodes on demand over the network, according to the application requirements.

  • Proactive software configuration. Instead of installing and configuring OSes and middleware incrementally after they are disseminated to the compute nodes, all software components in a specific EE must be pre-configured in the InstantGrid server. In other words, software is not disseminated to the compute nodes until every component is ready to run and form the desired EE. This approach shortens the time needed to compose EEs and to switch between them.

  • Performance optimization techniques. The centralized management model implies that an entire EE (which could be as large as a few gigabytes) has to be disseminated to the compute nodes on demand. Replicating all files is obviously impractical, yet existing network booting approaches that rely entirely on the network file system (NFS) result in poor runtime performance. We aim to address this problem by exploiting efficient I/O caching techniques to avoid excessive file transfer. In addition, the discriminative file sharing mechanisms select a suitable strategy (e.g., NFS-shared, replication, etc.) according to the usage pattern of each file, which optimizes both dissemination and runtime performance.

  • In-memory execution mode. We aim to cater for a scenario in which the data and OS held in the permanent storage of the compute nodes are not altered (or even accessed) while an EE obtained via the network executes, i.e., a complete in-memory operation. This is especially useful for supporting grid computing on existing cluster platforms, desktop/home computers, and diskless blade servers.
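
The discriminative file sharing idea from the performance bullet above can be illustrated with a small Python sketch. This is not SLIM's actual logic: the `FileUsage` fields, the thresholds, and the rule that written files get a private copy are all assumptions made for illustration, since the text only says that a strategy is chosen from each file's usage pattern.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    NFS_SHARED = "nfs-shared"   # leave on the server, read over the network
    REPLICATE = "replicate"     # copy to the compute node


@dataclass
class FileUsage:
    path: str
    size_bytes: int
    reads_per_boot: int   # hypothetical metric: reads observed per node boot
    written: bool         # hypothetical metric: does a node modify the file?


def choose_strategy(usage: FileUsage,
                    hot_reads: int = 10,
                    max_replica_bytes: int = 64 * 1024 * 1024) -> Strategy:
    """Pick a dissemination strategy from a file's usage pattern.

    Both thresholds are illustrative defaults, not values from SLIM.
    """
    # Assumption: a file that nodes write to needs a node-local copy.
    if usage.written:
        return Strategy.REPLICATE
    # Small, frequently read files are cheap to copy once and costly
    # to fetch over the network again and again.
    if (usage.reads_per_boot >= hot_reads
            and usage.size_bytes <= max_replica_bytes):
        return Strategy.REPLICATE
    # Everything else stays shared, so it is transferred only if used.
    return Strategy.NFS_SHARED
```

Under these assumed rules, a rarely read documentation file stays on the NFS share, while a small, frequently read shared library is replicated to each node.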


SLIM Working Mechanism

1. Software installation at SLIM server
2. Client boots and obtains kernel
3. OS image/App disseminated
4. Process to generate certificates


With Single Linux Image Management (SLIM), administrators can boot a large number of physical machines with a customized OS and execution environment. The "One-Click Virtual Clusters" feature lets an administrator create a v-cluster instantly. The system guarantees performance isolation between different v-clusters running on the same physical host via advanced isolation techniques. With live migration of virtual machines, the system can achieve dynamic load balancing and consolidation to make maximal use of physical resources.
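
To make the idea of several v-clusters sharing the same hosts more concrete, here is a minimal Python sketch of consolidation by first-fit decreasing placement. The source does not say which placement policy the framework uses, so the algorithm is a stand-in; the `Host` class, the `place` function, and the use of CPU cores as the only resource are hypothetical simplifications.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Host:
    name: str
    cores: int
    vms: List[Tuple[str, int]] = field(default_factory=list)  # (vm name, cores)

    def free(self) -> int:
        """Cores not yet assigned to any VM on this host."""
        return self.cores - sum(cores for _, cores in self.vms)


def place(requests: Dict[str, List[int]], hosts: List[Host]) -> List[str]:
    """Place the VMs of each v-cluster onto hosts, largest VM first.

    `requests` maps a v-cluster name to the core count of each of its VMs.
    Returns the names of VMs that did not fit on any host.
    """
    pending = [(f"{vc}-{i}", cores)
               for vc, sizes in requests.items()
               for i, cores in enumerate(sizes)]
    # Largest first, so big VMs are not blocked by fragmentation.
    pending.sort(key=lambda vm: vm[1], reverse=True)

    unplaced = []
    for vm_name, cores in pending:
        for host in hosts:
            if host.free() >= cores:
                host.vms.append((vm_name, cores))
                break
        else:
            # No host had enough free cores for this VM.
            unplaced.append(vm_name)
    return unplaced
```

Filling earlier hosts first keeps later hosts empty where possible, which is the consolidation effect the paragraph describes. A real system would also have to account for memory, I/O, and the cost of live migration.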




GUI for VM Management


  • Roy S.C. Ho, K.K. Yin, David C.M. Lee, Daniel H.F. Hung, C.L. Wang, and Francis C.M. Lau, "InstantGrid: A Framework for On-Demand Grid Point Construction," The International Workshop on Grid and Cooperative Computing (GCC 2004), pp. 911-914, Oct 21-24, 2004, Wuhan, China.
  • Roy S.C. Ho, David C.M. Lee, Daniel H.F. Hung, Cho-Li Wang, and Francis C.M. Lau, "On Managing Execution Environments for Utility Computing," Proceedings of Network Research Workshop 2004, 18th Asia-Pacific Advanced Network Meetings (APAN 2004), pp. 175-182, July 6, 2004, Cairns, Australia.

Copyright HKU CS Department 2009-2010