Heuristics for Efficient Resource Allocation in Cloud Computing

Resource allocation in cloud computing determines the allocation of computer and network resources of service providers to the service requests of users so that user service requirements are met. Solving the resource allocation problem as an optimization problem to obtain the optimal solution in real time is not scalable. This paper presents the development and testing of heuristics for efficient resource allocation that obtain near-optimal solutions in a scalable manner. We first define the resource allocation problem as a Mixed Integer Programming (MIP) optimization problem and obtain the optimal solutions for various resource-service problem types. Based on the analysis of the optimal solutions, we design heuristics for efficient resource allocation. We then evaluate the performance of the resource allocation heuristics using various resource-service problem types and different numbers of service requests and resources. The results show that the performance of the heuristics is comparable to that of the optimal solutions. The resource allocation heuristics also demonstrate better computational efficiency, and thus better scalability, than solving the MIP problems to obtain the optimal solutions.


Introduction
A cloud is defined as a large pool of virtualized resources such as hardware, platforms, or services that are built on a distributed infrastructure of physical resources [1,2]. Resource management in cloud computing is a challenging task because it has to satisfy the objectives of Cloud Computing Service Providers (CCSPs) and cloud users in an environment made unpredictable by fluctuating workloads and large shared resources [3]. Resource allocation aims to ensure that the requirements of the requested services are met by the resources of the CCSPs. The success of cloud computing depends upon the allocation of resources to requested services in an efficient and effective way [4]. Centralized methods of resource allocation require a central entity to either solve the resource allocation problem directly or coordinate a solution of the resource allocation problem with information on the requirements of all service requests and the resource states of all service providers. However, centralized methods face the challenge of scalability when solving a large-scale resource allocation problem and obtaining the optimal solution in real time [5]. Decentralized methods rely on interactions among service providers and between service providers and end users to seek the solution of a resource allocation problem, without any central entity. Using such self-management principles in a decentralized manner does not necessarily guarantee optimal or near-optimal solutions for resource allocation. This paper focuses on centralized methods, especially heuristics that achieve computational efficiency as well as optimal or near-optimal solutions of resource allocation.
Page: 2 www.raftpubs.com
Heuristics for centralized resource allocation with computational scalability have not been well addressed in existing work on cloud computing. In comparison with the reviewed work as summarized in Table 1, this study considers the satisfaction of QoS and resource requirements stated in Service Level Agreements (SLA) when determining the allocation of computer and network resources to service requests, rather than allowing SLA violations. To ensure the satisfaction of all service requests' requirements, this study introduces the use of system models [6] for the precise resource allocation. System models capture how services impose loads on resources and change the state of resources and the system performance. The heuristics in this study deal with the insufficient resource capacity by generating solutions with some service requests dropped as in the optimal solutions when the overall resource capacity is not sufficient to satisfy all service requests, rather than outsourcing the overloaded requests to external clouds. Moreover, the heuristic solutions in this study are generated with scalability. The performance of the heuristics is compared to the optimal solutions in this study.
Table 1. Summary of the reviewed work.
Reference
[7-10]      YES   NO    NO    YES   NO
[11-16]     NO    NO    NO    NO    NO
[17]        YES   NO    YES   YES   YES
[18]        YES   NO    NO    YES   NO
[19]        YES   NO    YES   YES   NO
[20-26]     YES   NO    NO    NO    NO
[27,28]     YES   NO    NO    NO    YES
[29]        NO    YES   YES   NO    NO
[30]        YES   YES   YES   YES   NO
Section 2 provides the Mixed Integer Programming (MIP) formulation of the optimization problem for resource allocation. The optimal solutions of the optimization problem for various resource-service problem types are examined to gain insights and develop heuristics for the efficient resource allocation. Section 3 describes various resource-service problem cases and different numbers of service requests and resources used to test and evaluate the performance of the heuristics. Section 3 also introduces performance measures and presents the comparison of the heuristic solutions with the optimal solutions. Section 4 concludes the paper.

Heuristics of centralized resource allocation
In this section, we first present the MIP formulation of a resource allocation problem. We then analyze the optimal solutions of various problem cases to develop heuristics that capture the essence of decisions made in the optimal solutions. Finally, we describe the heuristics for various problem types and the implementation of the heuristics.

The formulation of a resource allocation optimization problem
Resource allocation is often addressed as an optimization problem consisting of objectives, decision variables, and constraints. There are mainly three types of optimization objectives in resource allocation: 1) resource performance objectives such as resource utilization, load balancing, and energy saving by switching servers on and off depending on their workload and resource status [6,9,11,16,17,23,26,31-34].
The optimization problem of resource allocation is subject to various types of constraints. Application requirements for resources and QoS are stated in SLAs between cloud users and CCSPs, covering CPU and memory requirements for host machines as well as bandwidth, delay, and QoS requirements (e.g., execution time) of services and applications. Capacity limits of resources are given to indicate the maximum capacity of each system resource. System models are provided to describe how services impose resource workloads and thus change the state of system resources, which in turn affects the performance of services [41,42]. The behavior and performance of resources, the system and services predicted from such system models are essential to precise resource allocation [43].
In the formulation of the optimization problem, decision variables are used to assign requested services to CCSPs and to determine the specific level of each service parameter for each requested service that is satisfied by the assigned CCSP. Each type of service is associated with certain service parameters (e.g., a web service is associated with service parameters such as bandwidth and response time (Chen, Farley, & Ye, 2004)) that are satisfied at certain levels by system resources on the assigned CCSPs. System models for each CCSP are included in the constraints to predict the resource workloads and service performance for precise resource allocation. The optimization problem addresses resource, system and application objectives of resource allocation and is solved for each epoch of service requests in the stream of dynamically arriving service requests based on the time-varying state of system resources. The optimization problem of resource allocation is formulated as an MIP problem. The following variables are used in the problem formulation.
• Decision variables
$X_{ki} = 1$ if service request $k$ is assigned to server $i$, and $X_{ki} = 0$ if service request $k$ is not assigned to server $i$, where $k$ denotes a given service request, $k = 1, \ldots, K$, and $i$ denotes a given server.
This input is from the user generating service request $k$ to specify QoS requirements, $k = 1, \ldots, K$.
• Server-client coordination constraint
This constraint requires that service request $k$ be assigned to at most one server $i$.
• Service constraints
This constraint requires that if $X_{ki} = 1$ and $U_{ks} = 1$, then $V_{is} = 1$; that is, if service request $k$ is assigned to server $i$, the service type $s$ of service request $k$ must be provided by server $i$.
This constraint enforces that the level of service parameter $d_s$ of service request $k$ on server $i$ should not exceed the limit (i.e., the maximum level).
• Service-resource-QoS relation constraints
… and is then summed over the QoS variables and all service requests. If a requested service is not served, the term inside the two summations in Formula 8 produces the value of 1. If a requested service is served, the term inside the two summations in Formula 8 produces a value in (0,1). Hence, minimizing the difference between the actual QoS level provided and the required QoS level of each service request in the objective function also causes more service requests to be served.
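The server-client coordination constraint and the service constraint described above can be checked for a candidate assignment with a few lines of code. The following is a minimal sketch, not the authors' implementation: the function name `check_assignment` and the list-of-lists encoding are illustrative assumptions, and the sketch reads X[k][i] as the assignment variable, U[k][s] as "request k asks for service type s", and V[i][s] as "server i provides service type s". Resource capacity and QoS constraints are not covered.

```python
from typing import List


def check_assignment(X: List[List[int]],
                     U: List[List[int]],
                     V: List[List[int]]) -> List[str]:
    """Return descriptions of violated constraints (empty list if none).

    X[k][i] = 1 if request k is assigned to server i (assumed encoding).
    U[k][s] = 1 if request k asks for service type s (assumed encoding).
    V[i][s] = 1 if server i provides service type s (assumed encoding).
    """
    violations = []
    for k, row in enumerate(X):
        # Server-client coordination: at most one server per request.
        if sum(row) > 1:
            violations.append(f"request {k} is assigned to {sum(row)} servers")
        for i, x_ki in enumerate(row):
            if x_ki != 1:
                continue
            # Service constraint: X_ki = 1 and U_ks = 1 imply V_is = 1.
            for s, u_ks in enumerate(U[k]):
                if u_ks == 1 and V[i][s] != 1:
                    violations.append(
                        f"server {i} does not provide type {s} of request {k}")
    return violations


# Three requests, two servers, two service types; request 2 is unassigned.
X = [[1, 0], [0, 1], [0, 0]]
U = [[1, 0], [0, 1], [1, 0]]
V = [[1, 1], [1, 0]]  # server 1 provides only service type 0
print(check_assignment(X, U, V))
```

In this example request 1 asks for service type 1 but is placed on server 1, which does not provide it, so one violation is reported. An unassigned request (all zeros in its row) is allowed by both constraints.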

Problem cases for obtaining optimal solutions
Sets of different problem cases are designed to generate the optimal solutions which are then analyzed to identify heuristics for generating optimal or near-optimal resource allocation solutions in a computationally efficient manner.
In the problem cases, there are requests for two types of services: the communication intensive service (e.g., video streaming) and the computation intensive service (e.g., data encryption). We consider one service parameter for the communication intensive service (e.g., the size of video data) and one service parameter for the computation intensive service (e.g., the key length for data encryption). We also consider one QoS variable for the communication intensive service (e.g., delay) with three levels of Low (L), Medium (M) and High (H) and one QoS variable for the computation intensive service (e.g., completion time) with three levels of L, M and H. If the actual QoS level provided to a service request is equal to or greater than a required QoS level, then the service request is considered satisfied. Table 2 shows three levels of QoS variables and the maximum level of service parameters for the communication intensive service and the computation intensive service used in the three problem sets.
In the problem cases, there are two types of servers: the communication-centered server and the computation-centered server. Both types of servers have two types of resources: the CPU resource and the network bandwidth resource. On the communication-centered server, the bandwidth resource has more capacity than the CPU resource. On the computation-centered server, the CPU resource has more capacity than the bandwidth resource. Table 3 shows the capacity limits of two resource variables for the CPU resource and the bandwidth resource on the communication-centered server and the computation-centered server, which appear in Formula 6 in the formulation of the MIP optimization problem. Three resource levels of Small (S), Medium (M) and Large (L) are used for the communication-centered server, and three resource levels of S, M and L are used for the computation-centered server.
In Set 1, we have each communication-centered server set its resource capacity limits to one of three levels: 30, 36 and 41 as the capacity limits of the CPU resource for the S, M and L levels, respectively, and 100, 200 and 350 as the capacity limits of the bandwidth resource for the S, M and L levels, respectively. We also have each computation-centered server set its resource capacity limits to one of three levels: 116, 232 and 406 as the capacity limits of the CPU resource for the S, M and L levels, respectively, and 27, 32 and 35 as the capacity limits of the bandwidth resource for the S, M and L levels, respectively. Similarly in Set 2, we have each communication-centered server set its resource capacity limits to one of three levels: 45, 65 and 90 as the capacity limits of the CPU resource for the S, M and L levels, respectively, and 100, 135 and 245 as the capacity limits of the bandwidth resource for the S, M and L levels, respectively. We also have each computation-centered server set its resource capacity limits to one of three levels: 105, 155 and 205 as the capacity limits of the CPU resource for the S, M and L levels, respectively, and 50, 70 and 80 as the capacity limits of the bandwidth resource for the S, M and L levels, respectively.
In Set 3 we randomly select a value in a range as the capacity limits of two resources (the CPU resource and the bandwidth resource) at each of the three capacity levels (S, M and L). For each communication-centered server, a specific value is randomly selected from 24 to 26 as the capacity limit of the CPU resource and from 81 to 84 as the capacity limit of the bandwidth resource for the S level, from 30 to 33 as the capacity limit of CPU resource and from 93 to 95 as the capacity limit of the bandwidth resource for the M level, and from 38 to 40 as the capacity limit of the CPU resource and from 104 to 107 as the capacity limit of the bandwidth resource for the L level. For each computation-centered server, a specific value is randomly selected from 78 to 79 as the capacity limit of the CPU resource and from 25 to 26 as capacity limit of the bandwidth resource for the S level, from 87 to 89 as the capacity limit of the CPU resource and from 29 to 30 as the capacity limit of the bandwidth resource for the M level, and from 95 to 97 as the capacity limit of the CPU resource and from 32 to 33 as the capacity limit of the bandwidth resource for the L level. Note that each server can serve both communication intensive services and computation intensive services.
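The Set 3 set-up above can be written down as a small lookup table plus a sampling step. The sketch below is only an illustration: the names `SET3_RANGES` and `sample_capacity` are hypothetical, and the text does not say whether the values are integers or how they are distributed, so a uniform draw over integers with both endpoints included is an assumption.

```python
import random

# (low, high) ranges for Set 3, copied from the paragraph above.
SET3_RANGES = {
    "communication": {
        "S": {"cpu": (24, 26), "bandwidth": (81, 84)},
        "M": {"cpu": (30, 33), "bandwidth": (93, 95)},
        "L": {"cpu": (38, 40), "bandwidth": (104, 107)},
    },
    "computation": {
        "S": {"cpu": (78, 79), "bandwidth": (25, 26)},
        "M": {"cpu": (87, 89), "bandwidth": (29, 30)},
        "L": {"cpu": (95, 97), "bandwidth": (32, 33)},
    },
}


def sample_capacity(server_type: str, level: str, rng: random.Random) -> dict:
    """Draw one CPU limit and one bandwidth limit for a server.

    Assumption: uniform draw over integers, endpoints included.
    """
    ranges = SET3_RANGES[server_type][level]
    return {res: rng.randint(lo, hi) for res, (lo, hi) in ranges.items()}


rng = random.Random(0)  # fixed seed so a problem case can be regenerated
print(sample_capacity("communication", "M", rng))
print(sample_capacity("computation", "L", rng))
```

Passing an explicit `random.Random` instance keeps the generated problem cases reproducible, which matters when the same case must be solved by both the MIP solver and a heuristic.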
As in our previous work (Yau et al., 2009) using resource and QoS impact models of various services, this study also uses two variables of resource usage, the CPU resource usage ($R^{1}_{ki}$) and the bandwidth resource usage ($R^{2}_{ki}$), to determine the QoS performance of the two services. A communication intensive service requires more usage of the bandwidth resource than of the CPU resource, while a computation intensive service requires more usage of the CPU resource than of the bandwidth resource. The resource and QoS impact models of services in Formula 4 and Formula 5 are needed in the MIP optimization formulation. Many forms of models (e.g., linear or nonlinear) can be used in Formula 4 and Formula 5. Without loss of generality, linear models are used in this study (see Table 4). As shown in Table 4, various parameter values in these linear models are used in the three different problem sets so that the heuristics identified from analyzing the optimal solutions of the resource allocation problems are not influenced by parameter values.
The specific number of service requests for each capacity-request condition of each problem case is determined by looking into resource usages based on the F functions of service-resource relations and the corresponding QoS values based on the G functions of resource-QoS relations in Table 4 to meet the QoS requirements of all the service requests in Table 2 with the given resource capacity in Table 3. Table 5 gives two sets of the parameters used in the heuristics to allow us to examine possible effects of different parameter values on the resource allocation solutions.
The C# code first loads all the necessary input files of the problem, including the given inputs as well as the service-resource relation functions and resource-QoS relation functions for each service type. With the loaded input files, the C# code then calls ILOG OPL and CPLEX to run the MIP optimization and solve the problem to generate an optimal solution. The computation time of obtaining the optimal solution is recorded by the C# code. Note that the times required for loading input files and generating output files are also included in the computation time of obtaining the optimal solution.

Analysis of optimal solutions to develop heuristics
The optimal solution to each problem is analyzed to gain insights into the resource allocation decisions made in the optimal solution. Based on the analysis and insights gained from the optimal solutions, we develop heuristics for Cases A, B and C of resource-service conditions. For all problems of Case A, where each server has sufficient resource capacity to satisfy all service requests of all clients, the optimal solutions assign most service requests to one server (i.e., the first server with i = 1 in the problem formulation). Hence, the heuristic shown in Table 6 is identified for Case A. The optimal solutions to the problems of Case B, where each server does not have sufficient resource capacity to satisfy all service requests but the total resource capacity of all servers is sufficient to satisfy all service requests, reveal six different heuristics as shown in Table 7. For Case C problems, neither each server alone nor all servers together have sufficient resource capacity to satisfy all service requests. Table 8 gives the heuristics for Case C problems based on the analysis of their optimal solutions.

Table 6. Heuristic for Case A.
Heuristic: Description
A-1: Designate one server as the dominant server, and select the dominant server to serve a service request with the probability of α and another server to serve the service request with the probability of (1-α). The parameter, α, takes a value in (0, 1] and is closer to 1 than 0 (e.g., 0.9).

Table 7. Heuristics for Case B.
Heuristic: Description
B-1(a): Select a server randomly to serve a service request.
B-1(b): Designate one server as the dominant server, and select the dominant server to serve a service request with the probability of β and another server to serve the service request with the probability of (1-β). The parameter, β, takes a value in (0, 1] and is closer to 1. This heuristic is the same as A-1 applied to B cases.
B-2(a): Select a server of one server type (e.g., communication-centered server) randomly to serve a service request of the same type (e.g., communication intensive service) with the probability of γ and a server of a different server type (e.g., computation-centered server) to serve the service request with the probability of (1-γ). The parameter, γ, takes a value in (0, 1] and is closer to 1.
B-2(b): Designate a server of each server type as the dominant server of the server type, select the dominant server of one server type to serve a service request of the same type with the probability of γ, and the dominant server of a different server type to serve the service request with the probability of (1-γ). The parameter, γ, takes a value in (0, 1] and is closer to 1.
B-3: Select a server of one server type (e.g., communication-centered server) randomly to serve a service request of the same type (e.g., communication intensive service) with a given QoS level of L, M or H with the corresponding level-specific probability (one parameter for each of L, M and H), and a server of a different server type (e.g., computation-centered server) to serve the service request with the given QoS level of L, M or H with the complementary probability (one minus the level-specific parameter).

Table 8. Heuristics for Case C.
Heuristic: Description
C-1: Select a server randomly to serve a service request. If the selected server is full, then the service request is randomly assigned to another server. Drop a service request if its QoS requirement cannot be satisfied by the available resource capacity. This heuristic is the same as B-1(a) with the addition of dropping a service request due to the insufficient capacity.
C-2: Select a server of one server type (e.g., communication-centered server) randomly to serve a service request of the same type (e.g., communication intensive service) with the probability of γ and a server of a different server type (e.g., computation-centered server) to serve the service request with the probability of (1-γ). Drop a service request if its QoS requirement cannot be satisfied by the available resource capacity. The parameter, γ, takes a value in (0, 1] and is closer to 1. This heuristic is the same as B-2(a) with the addition of dropping a service request due to the insufficient capacity.
C-3: Select a server of one server type (e.g., communication-centered server) randomly to serve a service request of the same service type (e.g., communication intensive service) with a given QoS level of L, M or H at the corresponding level-specific probability (one parameter for each of L, M and H), and a server of a different server type (e.g., computation-centered server) to serve the service request with the given QoS level of L, M or H with the complementary probability (one minus the level-specific parameter). This heuristic is the same as B-3 with the addition of dropping a service request due to the insufficient capacity.
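To make the type-matching heuristic with request dropping (C-2) concrete, the following is a minimal sketch under stated assumptions; it is not the authors' C# implementation. The names `Server`, `Request`, `fits` and `assign_c2` are hypothetical. Each request is assumed to carry fixed CPU and bandwidth usages that would meet its QoS requirement, whereas the paper derives usages from the service-resource and resource-QoS models. Falling back to the other server type when the preferred type has no room, and choosing randomly only among servers that still have room, are also simplifying assumptions.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    kind: str         # "communication" or "computation"
    cpu: float        # remaining CPU capacity
    bandwidth: float  # remaining bandwidth capacity


@dataclass
class Request:
    kind: str         # service type, compared with Server.kind
    cpu: float        # assumed CPU usage needed to meet the required QoS
    bandwidth: float  # assumed bandwidth usage needed to meet the required QoS


def fits(server: Server, req: Request) -> bool:
    return server.cpu >= req.cpu and server.bandwidth >= req.bandwidth


def assign_c2(servers: List[Server], requests: List[Request],
              gamma: float, rng: random.Random) -> List[Optional[int]]:
    """Return the chosen server index per request, or None if dropped."""
    result: List[Optional[int]] = []
    for req in requests:  # requests are processed sequentially
        same = [i for i, s in enumerate(servers) if s.kind == req.kind]
        other = [i for i, s in enumerate(servers) if s.kind != req.kind]
        # With probability gamma, try the matching server type first.
        order = (same, other) if rng.random() < gamma else (other, same)
        chosen = None
        for group in order:
            candidates = [i for i in group if fits(servers[i], req)]
            if candidates:
                chosen = rng.choice(candidates)
                break
        if chosen is not None:
            # Reserve the capacity consumed by this request.
            servers[chosen].cpu -= req.cpu
            servers[chosen].bandwidth -= req.bandwidth
        result.append(chosen)  # None marks a dropped request
    return result


servers = [Server("communication", cpu=10, bandwidth=100),
           Server("computation", cpu=100, bandwidth=10)]
requests = [Request("communication", cpu=5, bandwidth=60),
            Request("communication", cpu=5, bandwidth=60),
            Request("computation", cpu=50, bandwidth=5)]
print(assign_c2(servers, requests, gamma=1.0, rng=random.Random(0)))
```

With γ = 1 the second communication request is dropped because neither server has 60 units of bandwidth left after the first one is placed. Removing the drop step and the capacity bookkeeping gives the B-2(a) behaviour, and replacing the type groups with a single dominant server chosen with probability α gives A-1.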

Problem cases
We design various problem cases, including the sets (Sets 1 and 3) of problem cases described in Section 2 and two additional sets (Sets 4 and 5) of larger problem cases with large numbers of servers (ten and twenty servers) and service requests, to evaluate the solution obtained from the heuristics in comparison with the optimal solution for each problem case. Two different sets of parameter values in Table 5 are used to examine possible effects of different parameter values on the resource allocation solutions obtained from applying the heuristics. Tables 9-11 show the set-up of the additional larger problem cases.
The heuristics are implemented as an algorithm written in C# in Microsoft Visual Studio 2010. The heuristic algorithm is run on the same laptop computer that runs the ILOG CPLEX software to obtain the optimal solutions: a Samsung Q320 with an Intel Core 2 Duo T6500 2.1 GHz processor, 4 GB RAM, and Windows 7. The C# code first loads all the necessary input files, including the given inputs as well as the service-resource relation functions and resource-QoS relation functions for each service type. With the loaded input files, the C# code then uses an appropriate heuristic to solve each problem case and generate the heuristic solution. The computation time of obtaining a heuristic solution is recorded by the C# code. Note that the times required for generating output files are also included in the computation time of obtaining the heuristic solution.

Performance measures for the solution optimality
Regarding the solution optimality, two measures are introduced: the number of dropped service requests and the average ratio of actual QoS values that all served service requests receive based on the resource-service assignment. We examine Formula 8 below which is also described in Section 2 and presents the objective function of the optimization problem.
If service request k is served by a server, the minimum limit of the QoS requirement of this service must be satisfied, thus … (13)
The main part of Formula 10 considers the ratio of the provided QoS to the required QoS for each served service request, which is then summed over all served service requests. The smaller the SAV ratio, the better the minimization of the objective function is achieved. Formula 13 is the second performance measure of the solution optimality and gives the average ratio of the provided QoS to the required QoS for all served service requests.
Therefore, two performance measures, the total number of dropped service requests and the SAV ratio, are used to compare and evaluate the optimal solutions and the heuristic solutions with regard to the solution optimality. Table 12 shows the comparisons between the optimal solutions and the heuristic solutions using the first set of parameters in Table 5 for the problem cases in Sets 1, 3, 4 and 5. The heuristic solutions are compared with the optimal solutions in the following four aspects: the percentage of problem cases whose heuristic solutions are the same as the optimal solutions; the percentage of problem cases whose heuristic solutions are different from the optimal solutions; the difference in the number of dropped service requests between the heuristic solution and the optimal solution for each problem case, shown by the average, range and standard deviation (denoted by SD in Table 12) over all the problem cases in each type (Case A, Case B and Case C); and the comparison of the SAV ratio of the heuristic solution with that of the optimal solution for each problem case, shown by the average, range and standard deviation (denoted by SD in Table 12) over all the problem cases in each type (Case A, Case B and Case C).
For the 252 Case A problem cases in Set 1, the heuristic solutions for 44.44% of these problem cases are the same as the optimal solutions, and the heuristic solutions for 55.56% of these problem cases are different from the optimal solutions. The heuristic solutions serve all the service requests with the same QoS values as the optimal solutions and thus have the same SAV ratio of 1.04 on average over all the problem cases, with a range of 1.02 to 1.08 and a standard deviation of 0.02. As shown in Formula 13, the SAV ratio is an average ratio of the provided QoS to the required QoS for all the served service requests.
If a service request is served, its provided QoS is equal to or greater than its required QoS, and thus its SAV ratio is equal to or greater than 1. Like the optimal solutions, the heuristic solutions do not drop any service request due to the sufficient resource capacity at each server to satisfy all the service requests in all Case A problems.
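The two performance measures can be computed directly from a solution once the provided and required QoS values are known. The sketch below is an illustration under assumptions rather than a transcription of Formula 13, which is not reproduced in the text: the function name `performance_measures` is hypothetical, each request is assumed to have a single QoS variable (as in the problem cases), and a dropped request is encoded as `None`.

```python
from typing import List, Optional, Tuple


def performance_measures(provided: List[Optional[float]],
                         required: List[float]
                         ) -> Tuple[int, Optional[float]]:
    """Return (number of dropped requests, SAV ratio).

    provided[k] is the QoS level given to request k, or None if dropped.
    required[k] is the required QoS level of request k (must be > 0).
    The SAV ratio is the mean of provided/required over served requests.
    """
    dropped = sum(1 for p in provided if p is None)
    ratios = [p / r for p, r in zip(provided, required) if p is not None]
    sav = sum(ratios) / len(ratios) if ratios else None
    return dropped, sav


# Three requests: the second one is dropped.
dropped, sav = performance_measures([3.0, None, 2.2], [3.0, 2.0, 2.0])
print(dropped, sav)
```

Here the served requests have ratios of 1.0 and 1.1, so the SAV ratio is about 1.05 and one request is counted as dropped. Because every served request receives at least its required QoS, each individual ratio, and therefore the average, is at least 1.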

Solution optimality
For the 252 Case B problem cases in Set 1, the heuristic solutions for 58.33% of these problem cases are the same as the optimal solutions, and the heuristic solutions for 41.67% of these problem cases are different from the optimal solutions. The heuristic solutions serve all the service requests with the same average, range and standard deviation of the QoS values as the optimal solutions. Like the optimal solutions, the heuristic solutions do not drop any service request because the resource capacity over all servers is sufficient to satisfy all service requests of all clients in Case B.
For the 252 Case C problem cases in Set 1, the heuristic solutions for 52.78% of the problem cases are the same as the optimal solutions, and the heuristic solutions for 47.22% of the problem cases are different from the optimal solutions. The heuristic solutions have the same SAV ratio as the optimal solutions in the average, range and standard deviation. Since neither any single server nor all servers together have sufficient resource capacity to satisfy all the service requests in Case C, both the optimal solutions and the heuristic solutions drop some service requests. The heuristic solutions drop only about 2 more service requests (or about 4.39% of the service requests) on average than the optimal solutions, and the range of differences is 0.00 to 25.58 with a standard deviation of 4.61. The heuristic solutions drop a few more service requests than the optimal solutions because the heuristic algorithm processes service requests sequentially and thus makes temporally local decisions, whereas the optimal solutions make temporally global decisions by considering all service requests at one time. Hence, the heuristic solutions serve two fewer service requests than the optimal solutions on average but provide QoS values similar to those of the optimal solutions for all the served service requests.
For Set 3 of problem cases, the heuristic solutions for 37.70% of the 252 Case A problem cases and 53.57% of the 252 Case B problem cases are the same as the optimal solutions, and the heuristic solutions for 62.30% of the 252 Case A problem cases and 46.43% of the 252 Case B problem cases are different from the optimal solutions. The heuristic solutions serve all the service requests with the same average, range and standard deviation of the SAV ratios as the optimal solutions in Case A and Case B. Like the optimal solutions, the heuristic solutions do not drop any service request because the resource capacity of each server in Case A, or of all the servers together in Case B, is sufficient to satisfy all the service requests.
For Case C problem cases in Set 3, the heuristic solutions for 61.90% of the 252 problem cases are the same as the optimal solutions, and the heuristic solutions for 38.10% of the problem cases are different from the optimal solutions. The heuristic solutions have an average, range and standard deviation of the SAV ratios similar to those of the optimal solutions. Since neither any single server nor all the servers together have sufficient resource capacity to satisfy all the service requests, both the optimal solutions and the heuristic solutions drop some service requests. The heuristic solutions drop about one more service request on average (1.45% of all the service requests) than the optimal solutions, and the range of differences is -0.92 to 10.87 with a standard deviation of 1.62. Hence, the heuristic solutions serve one fewer service request than the optimal solutions on average but provide QoS values similar to those of the optimal solutions for all the served service requests. The negative value of -0.92 in the difference of the dropped service requests indicates that the heuristic solution serves about one more service request than the optimal solution, which is unusual. The heuristic algorithm processes service requests sequentially and thus makes temporally local decisions to select a service request by considering only resource consumption, whereas the optimal solutions make temporally global decisions by considering both the resource consumption and the objective value of all service requests at one time.
For Case A and Case B problem cases in Set 4, the heuristic solutions for 43.25% of the 252 Case A problem cases and 17.06% of the 252 Case B problem cases are the same as the optimal solutions, and the heuristic solutions for 56.75% of the Case A problem cases and 82.94% of the Case B problem cases are different from the optimal solutions. The heuristic solutions serve all the service requests with the same average, range and standard deviation of the SAV ratios as the optimal solutions in both Case A and Case B. Like the optimal solutions, the heuristic solutions do not drop any service request in either Case A or Case B because the resource capacity at each server in Case A, or of all the servers together in Case B, is sufficient to satisfy all the service requests.
For Case C problem cases in Set 4, the heuristic solutions for 57.94% of the 252 problem cases are the same as the optimal solutions, and the heuristic solutions for 42.06% of the problem cases are different from the optimal solutions. The heuristic solutions have an average, range and standard deviation of the SAV ratios similar to those of the optimal solutions. Since neither any single server nor all the servers together have sufficient resource capacity to satisfy all service requests in Case C, both the optimal solutions and the heuristic solutions drop some service requests. The heuristic solutions drop about five more service requests on average (2.82% of all the service requests) than the optimal solutions, and the range of differences is 0.00 to 79.92 with a standard deviation of 12.71. Hence, the heuristic solutions serve five fewer service requests than the optimal solutions on average but provide QoS values similar to those of the optimal solutions for all the served service requests.
For Case A and Case B problem cases in Set 5, the heuristic solutions for 38.89% of the 252 Case A problem cases and 20.24% of the 252 Case B problem cases are the same as the optimal solutions, and the heuristic solutions for 61.11% of the Case A problem cases and 79.76% of the Case B problem cases are different from the optimal solutions. The heuristic solutions have the same average, range and standard deviation of the SAV ratios as the optimal solutions in both Case A and Case B. Like the optimal solutions, the heuristic solutions do not drop any service request because the resource capacity at each server in Case A, or of all the servers together in Case B, is sufficient to satisfy all the service requests.
For the Case C problem cases in Set 5, the heuristic solutions are the same as the optimal solutions for 58.33% of the 252 problem cases and differ from the optimal solutions for the remaining 41.67%. The heuristic solutions have an average, range, and standard deviation of the SAV ratios similar to those of the optimal solutions. Since neither any single server nor all the servers together have sufficient resource capacity to satisfy all the service requests, both the optimal solutions and the heuristic solutions drop some service requests. The heuristic solutions drop about ten more service requests on average (2.69% of all the service requests) than the optimal solutions; the difference ranges from 0.00 to 168.85 with a standard deviation of 24.74. Hence, the heuristic solutions serve about ten fewer service requests on average than the optimal solutions but provide QoS values similar to those of the optimal solutions for all the served service requests.
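The percentages reported above for Sets 4 and 5 can be converted back into counts of problem cases as a quick consistency check. The following is a minimal sketch: the percentages and the total of 252 problem cases per case type come from the text, while the function name `to_count` and the resulting counts are our own illustration, inferred by rounding rather than reported in the paper.

```python
TOTAL_CASES = 252  # problem cases per case type, as stated in the text

# Percentage of heuristic solutions identical to the optimal solutions.
same_as_optimal = {
    ("Set 4", "Case A"): 43.25,
    ("Set 4", "Case B"): 17.06,
    ("Set 4", "Case C"): 57.94,
    ("Set 5", "Case A"): 38.89,
    ("Set 5", "Case B"): 20.24,
    ("Set 5", "Case C"): 58.33,
}


def to_count(percent, total=TOTAL_CASES):
    """Convert a reported percentage into the nearest whole number of cases."""
    return round(percent * total / 100)


for (problem_set, case), percent in same_as_optimal.items():
    same = to_count(percent)
    different = TOTAL_CASES - same  # inferred count, not reported in the paper
    print(f"{problem_set} {case}: {same} same, {different} different")
```

Each reported pair of percentages sums to 100%, so the number of differing cases is simply the remainder of the 252 cases.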
The four sets (1, 3, 4, and 5) of problem cases are also run using the second set of parameters in Table 5 to obtain the heuristic solutions. Table 13 gives the comparisons between the optimal solutions and the heuristic solutions. The performance results of the heuristic solutions with the second set of parameters are similar to those with the first set of parameters.

Scalability
The computation times of obtaining the optimal solutions and the heuristic solutions are compared. Table 14 shows the computation times (in seconds) of obtaining the optimal and heuristic solutions for the four sets with the first set of parameters in the heuristics. As Table 14 illustrates, as the number of service requests increases, the number of servers increases (from two servers in Set 1 to twenty servers in Set 5), and the resource capacity becomes increasingly limited from Case A to Case C, it takes a longer computation time on average, with a larger range of computation times, to obtain both the optimal and the heuristic solutions. However, the rate at which the computation time increases with the problem size and complexity is much larger for the optimal solutions than for the heuristic solutions. Table 15 shows the computation times (in seconds) of obtaining the optimal and heuristic solutions for the four sets of problem cases with the second set of parameters. The computation time results with the second set of parameters are similar to those with the first set of parameters. Hence, using the heuristics is much more scalable than solving the MIP optimization problem.
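One way to see why the gap in computation time widens with problem size is to compare how the two search efforts can grow. The sketch below is only an illustration under stated assumptions, not the paper's MIP model or heuristics: it assumes each service request is either dropped or assigned to exactly one server, and that a one-pass heuristic inspects each request-server pair at most once. The function names and the request counts are hypothetical; only the server counts (two and twenty) come from the text.

```python
def assignment_space(num_requests, num_servers):
    """Count assignments when each request goes to one server or is dropped."""
    return (num_servers + 1) ** num_requests


def sequential_checks(num_requests, num_servers):
    """Upper bound on request-server pairs inspected by a one-pass heuristic."""
    return num_requests * num_servers


# Request counts are hypothetical; server counts follow Set 1 (2) and Set 5 (20).
for num_requests, num_servers in [(10, 2), (20, 2), (20, 20)]:
    print(
        num_requests,
        num_servers,
        assignment_space(num_requests, num_servers),
        sequential_checks(num_requests, num_servers),
    )
```

A MIP solver does not enumerate this space exhaustively, so the count is a worst-case bound on the space it prunes, not a prediction of its running time.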

Conclusions
In this paper, we present the heuristics developed to capture the resource-service allocation decisions made in the optimal solutions for various problem cases. We then compare the heuristic solutions with the optimal solutions on extensive sets of problem cases with regard to solution optimality and scalability. The heuristic solutions perform comparably to the optimal solutions in terms of solution optimality. In the Case A problem cases, with sufficient resource capacity at each server, and the Case B problem cases, with sufficient resource capacity from all the servers together for all the service requests, neither the optimal solutions nor the heuristic solutions drop any service request, and all the heuristic solutions have the same average, range, and standard deviation of the SAV ratios as the optimal solutions and thus the same QoS performance. For the Case C problem cases, with insufficient resource capacity of all the servers together for all the service requests, the heuristic solutions drop a few more service requests (i.e., 1% to 4% of all the service requests) than the optimal solutions. This is because the heuristics process service requests sequentially and thus make temporally local decisions, whereas the optimal solutions make global decisions by considering all the service requests at one time. Nevertheless, the heuristic solutions provide QoS performance for all the served service requests comparable to that of the optimal solutions, with a similar average, range, and standard deviation of the SAV ratios.
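The effect of sequential versus global decisions on dropped requests can be illustrated with a toy example. The sketch below is not the paper's heuristics or MIP formulation: it assumes a single server with one scalar capacity, a simple admit-if-it-fits rule applied in arrival order, and an exhaustive search standing in for the global optimum. All names and numbers are hypothetical.

```python
from itertools import combinations


def sequential_admit(demands, capacity):
    """Admit requests in arrival order while capacity remains (toy rule)."""
    used, served = 0, []
    for index, demand in enumerate(demands):
        if used + demand <= capacity:
            used += demand
            served.append(index)
    return served


def global_admit(demands, capacity):
    """Exhaustively find a largest subset of requests that fits the capacity."""
    n = len(demands)
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if sum(demands[i] for i in subset) <= capacity:
                return list(subset)
    return []


demands = [6, 5, 5]  # hypothetical resource demands in arrival order
capacity = 10        # hypothetical capacity of a single server

seq = sequential_admit(demands, capacity)
opt = global_admit(demands, capacity)
print(f"sequential serves {len(seq)} request(s), global serves {len(opt)}")
```

Here the sequential rule commits capacity to the first request and must then drop the other two, while the global search serves the two smaller requests instead.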
The computation time of obtaining both an optimal solution and a heuristic solution increases as the problem size (the number of service requests and the number of servers) and the problem complexity (from Case A to Case C) increase. However, the rate at which the computation time increases with the problem size and complexity is much larger for the optimal solutions than for the heuristic solutions. This holds especially for Case B and Case C, where the limited resource capacity at each server or the insufficient resource capacity of all the servers together for all the service requests creates a larger solution space to search for the optimal solutions. Hence, using the heuristics is much more scalable in real time than solving the MIP optimization problem.