Cost-Aware Multifaceted Reconfiguration of Service- and Cloud-Based Dynamic Routing Applications

Dynamic reconfiguration is commonly used in service- and cloud-based applications. In combination with autoscalers, dynamic routers can adapt the system to the resource demands, e.g., in an e-commerce application offering discounts for services in a specific location. Without such measures, quality of service degrades, and a system overload can render an application non-responsive. However, the cost of cloud resource usage must be considered when performing these reconfiguration steps to avoid incurring high additional costs. This paper proposes a cost-aware multifaceted reconfiguration of dynamic routing applications. We study the depletion and rescheduling of idle components and use an infrastructure-as-code module to apply changes to the infrastructure. Moreover, when system components are in a steady state, our approach dynamically self-adapts between more centralized or more distributed routing to optimize reliability and performance. This adaptation is calculated based on a system-wide optimization analysis. When components are overloaded, we perform a per-component optimization to autoscale components multidimensionally. Our extensive systematic evaluation shows significant improvements in quality trade-off adaptations and system overload prevention. We provide prototypical tool support to demonstrate our concepts with illustrative sample cases.


I. INTRODUCTION
Cloud-based systems require dynamic routing for efficient performance. Due to the constantly changing nature of modern applications, dynamic routers such as API Gateways [27], Enterprise Service Buses [8], Message Brokers [14], or Sidecars [16] are typically utilized. These routing patterns may differ in implementation, but all serve the purpose of routing or blocking requests. To switch between these dynamic routing patterns, the number of routers in a service- and cloud-based system can be adjusted. However, it is essential to monitor the quality-of-service measures and make architectural decisions automatically. Designing routing architectures requires careful consideration of both reliability and performance: adding more routers to improve performance may increase the risk of system crashes because it introduces additional points of failure.
Moreover, when adapting the routing architecture pattern from distributed to centralized routing (and vice versa), we should ensure that the components are not overloaded. Cloud computing provides an elastic infrastructure to manage this dynamic behaviour. Horizontal autoscaling, i.e., adding or removing replicas, and vertical autoscaling, i.e., adding or removing resources, are commonly used in practice. A newer concept is multidimensional autoscaling, which combines the two previous methods in one decision-making step. However, the concept is not fully developed and has limitations, such as not considering the incoming load as an input.
Consider, for instance, an e-commerce shop that offers discounted products for a specific location. The application must cope with a sudden increase in incoming load that needs to be routed to these services. Dynamic routers and autoscalers can accommodate the increased demand. However, when adding components for the parallel processing of requests to increase performance, a reliability decrease is observed as there are more points of failure [2] in a system. Without such measures, a system overload can lead to an application being non-responsive. Nevertheless, if cloud resource costs are not considered, a business may lose profit by incurring high costs when dealing with sudden load spikes. To address such scenarios, we set out to answer the following research questions. RQ1: Can we find a cost-aware multifaceted reconfiguration approach for dynamic routing applications to adapt quality-of-service trade-offs and prevent component overloads? RQ2: What is the architecture of a supporting tool that facilitates the reconfiguration of a dynamic routing application using the optimal configuration solution? RQ3: How well does this multifaceted reconfiguration perform compared with the case where one architecture runs statically?
The contributions of this paper are as follows. Firstly, we model components as queuing stations [17] and consider different scenarios, i.e., when components are idle, steady, and transient. We introduce a cost-aware multifaceted reconfiguration of dynamic routing applications using an Infrastructure as Code (IaC) module to apply changes to the infrastructure. Moreover, we consider a system-wide Multi-Criteria Optimization (MCO) analysis [1] to optimize system reliability and performance, as well as a per-component MCO to autoscale components multidimensionally. Secondly, we provide a prototypical tool that facilitates the reconfiguration of dynamic routing applications. Our application provides artifacts to be used by IaC tools and a visualization environment to study different configurations and demonstrate our concepts.
To evaluate our approach, we consider multiple levels of call frequencies, component configurations, and routing profiles that we studied in an already-published empirical study [3]. Our extensive systematic evaluation shows significant improvements in quality trade-off adaptations and system overload prevention. Our approach yields up to 16.60% reliability gain and an average performance gain of 74.22%.
The structure of the paper is as follows: Section II presents an approach overview. Section III explains our approach in detail, and Section IV gives illustrative sample cases. Section V provides our prototypical tool support. Section VI presents the evaluation of the presented approach, and Section VII discusses the threats to the validity of our research. We study the related work in Section VIII and conclude in Section IX.

II. APPROACH OVERVIEW
In this paper, we study a cost-aware multifaceted reconfiguration of dynamic routing applications. A router is defined as an abstraction for any controller component that makes routing decisions, e.g., an API Gateway [27], an enterprise service bus [8], or a Sidecar [16]. We model the system components, i.e., services and routers, as queuing stations [17] having two subcomponents, namely a buffer and a processor, as shown in Figure 1. Let λ be the arrival rate and μ the processing rate of a component, measured in requests per second (r/s). Incoming requests are buffered in a queue at a rate of λ and processed at a rate of μ.

Fig. 1: Components as Queuing Stations
A component is in a steady state when its processing rate is greater than or equal to its arrival rate:

μ ≥ λ (1)

In the steady state, a component is not overloaded and can process incoming requests without delays caused by buffering.
On the other hand, the transient state refers to when a component is overloaded because its processing rate is lower than the arrival rate of the requests:

μ < λ (2)

We study three interrelated scenarios for components:
• when components are idle and can be depleted,
• when components are active and steady,
• when components are overloaded.
The first scenario considers the infrastructure changes when a reconfiguration occurs. The second scenario studies a per-system reconfiguration, which means we monitor the state of a system as a whole and reconfigure the components. The third scenario is a per-component reconfiguration, i.e., our approach monitors and reconfigures each component separately.
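The three component states can be distinguished directly from the arrival and processing rates. A minimal sketch in Python (the function name and the convention that an idle component sees no incoming requests are illustrative assumptions):

```python
def component_state(arrival_rate: float, processing_rate: float) -> str:
    """Classify a queuing-station component as idle, steady, or transient."""
    if arrival_rate == 0:
        return "idle"       # no incoming requests: candidate for depletion
    if processing_rate >= arrival_rate:
        return "steady"     # mu >= lambda: the queue does not grow
    return "transient"      # mu < lambda: the buffer fills up

print(component_state(0, 64))    # -> idle
print(component_state(50, 64))   # -> steady
print(component_state(200, 64))  # -> transient
```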
We study our empirical data set already reported in [3] to present illustrative examples (see Section IV) and to evaluate our approach. In our prior work, we performed an extensive experiment of 1200 hours and measured the quality-of-service metrics of dynamic routing applications. Our data set can be downloaded from the online artifact of this paper to support reproducibility. Table I presents our mathematical notations.

III. APPROACH DETAILS
This section presents the details of our approach.

A. Depletion of Idle Components
Some system components process requests sporadically and are idle between active periods. We characterize the sporadic load profile by a frequency of incoming requests during an active period followed by a delay with no incoming requests. As shown in Figure 2, we deplete components when idle (represented by dots). However, the depleted components can become active again and must be rescheduled. So we must consider the infrastructure changes, e.g., not overloading cloud nodes. We use an IaC module to automatically create and free cloud nodes to use resources efficiently and reduce costs. On the one hand, when depleting idle components, an efficient rescheduling of other active components might free a node. On the other hand, when a depleted component receives a request and needs to be rescheduled, all nodes might be occupied. In this case, the IaC module can create a node and schedule the component.
Assume the capacity of each node, i.e., the number of containers it can host, is known. When a request is received for a depleted component, Algorithm 1 schedules it either on an existing node or on a new one.
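The scheduling step can be sketched as a first-fit placement; this is a hedged reading of the algorithm described above, with node capacity given as a container count and illustrative names throughout:

```python
def schedule_component(nodes: list[dict], capacity: int, component: str) -> dict:
    """Place a rescheduled component on the first node with free
    capacity; otherwise, mimic the IaC module by creating a new node."""
    for node in nodes:
        if len(node["containers"]) < capacity:
            node["containers"].append(component)
            return node
    new_node = {"containers": [component]}  # IaC module creates a fresh node
    nodes.append(new_node)
    return new_node

# One node already full (capacity 2): scheduling forces a new node.
cluster = [{"containers": ["svc-a", "svc-b"]}]
schedule_component(cluster, 2, "svc-c")
print(len(cluster))  # -> 2
```

When a depletion later frees a node, the inverse step would release the node via the IaC module.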

B. Reconfiguration of Steady Components
When all components are active and steady according to Equation (1), we consider a system-wide MCO [1] optimizing reliability and performance trade-offs of the system as a whole.
1) Definitions: We define the following model elements in our reliability and performance models. n_rout and n_serv are the numbers of routers and services, and CI is the crash interval, i.e., the interval during which we check for a crash of a component. Assuming the heartbeat pattern [15] or the health check API pattern [24] is used, CI is the time between two consecutive health checks. cf is the call frequency (r/s), Com is the set of components, i.e., routers and services, P_c is the crash probability of each component, and d_c is the average downtime of a component after it crashes.
2) Reliability Model: Based on Bernoulli processes [31], the request loss R during component crashes is modeled [2] as:

R = Σ_{c ∈ Com} (P_c · d_c · cf) / CI (3)

In this formula, request loss is defined as the number of client requests not processed due to a failure, such as a component crash. Equation (3) gives the request loss per second as a metric of reliability by calculating the expected value of the number of crashes. Having this information, we sum all the requests received by a system during the downtime of a component and divide them by the observed system time.
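The request-loss model can be turned into a short function. A sketch assuming uniform crash probabilities and downtimes (as in the sample case of Section IV); it reproduces the worst-case values of 1.1 and 2.0 r/s quoted in the evaluation for n_serv = 10 and cf = 100 r/s:

```python
def request_loss(n_rout: int, n_serv: int, p_c: float, d_c: float,
                 cf: float, ci: float) -> float:
    """Expected request loss R in r/s: each of the n_rout + n_serv
    components crashes with probability p_c per crash interval ci,
    and each crash drops the cf * d_c requests arriving during the
    downtime; dividing by ci yields a per-second rate."""
    return (n_rout + n_serv) * p_c * d_c * cf / ci

# Worst cases of Section VI-A (P_c = 0.5%, d_c = 3 s, CI = 15 s):
centralized = request_loss(1, 10, 0.005, 3, 100, 15)   # ~1.1 r/s
distributed = request_loss(10, 10, 0.005, 3, 100, 15)  # ~2.0 r/s
```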
3) Performance Model: We model the average processing time of requests per router P as a performance metric. This metric is important as it allows us to study quality-of-service factors, e.g., the efficiency of architecture configurations.

P = T / ((cf − R) · T · n_rout) = 1 / ((cf − R) · n_rout) (4)

We count the processed requests in this formula by subtracting the request loss from the total requests, i.e., (cf − R) · T for an observed time T. We divide the observed time by the processed requests and the number of routers. Section IV-A presents an illustrative sample case.
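Under the same uniform assumptions, the performance model can be sketched as follows; it reproduces the boundary predictions (about 101.1 ms and 33.7 ms for n_serv = 10) quoted in Section VI-A:

```python
def avg_processing_time_ms(n_rout: int, n_serv: int, cf: float,
                           p_c: float = 0.005, d_c: float = 3,
                           ci: float = 15) -> float:
    """Average processing time of requests per router (ms): observed
    time divided by the processed requests, i.e., total requests minus
    request loss, and by the number of routers."""
    loss = (n_rout + n_serv) * p_c * d_c * cf / ci  # R in r/s
    return 1000.0 / ((cf - loss) * n_rout)

# n_serv = 10, cf = 10 r/s: centralized routing gives ~101.1 ms,
# three routers give ~33.8 ms, matching the evaluation's boundary range.
```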

4) System-Wide MCO:
We perform a multi-criteria optimization analysis to reconfigure an application by adjusting n_rout. We use the notations R(n_rout) and P(n_rout) to specify the reliability and performance predictions of an architecture configuration by its number of routers. Let R_th and P_th be the reliability and performance thresholds. We aim to minimize the request loss and the average processing time of requests per router without the prediction values violating R_th and P_th.
Additionally, we must ensure that the reconfiguration costs do not exceed a cost threshold. We define C(n_rout) as the reconfiguration costs for an architecture configuration by its number of routers and C_th as the cost threshold. Moreover, when choosing a lower n_rout, we must ensure that reducing the number of routers does not overload them. Let Rout be the set of routers of a system. We minimize R(n_rout) and P(n_rout) subject to:

R(n_rout) ≤ R_th (7)
P(n_rout) ≤ P_th
C(n_rout) ≤ C_th
μ_r ≥ λ_r for all r ∈ Rout

Typically, there is no single answer to an MCO problem but a set of acceptable points in the solution space [1]. Algorithm 2 (systemWideMCO) provides a simple solution to find a range of acceptable n_rout. The lower end of this range represents more centralized routing, so we find the lowest acceptable n_rout that does not violate the performance threshold. Conversely, the highest possible n_rout is bound by the reliability threshold. Having found the lower and upper values, we exclude the solutions that violate the cost threshold or result in overloading a router.

5) Preference Function:
We must choose a final reconfiguration solution from the n_rout range returned by the above analysis. An architect assigns weights to reliability and performance, so a preference function can automatically choose a final solution. For example, when performance is highly important, the preference function selects a higher n_rout, i.e., more distributed routing. This reconfiguration processes requests in parallel, giving a higher performance.

6) Reconfiguration Algorithm: Algorithm 3 presents our reconfiguration steps, triggered, for instance, whenever reliability or performance metrics degrade. Time intervals, manual triggering, or changes in the incoming load can also trigger the algorithm if more appropriate than metrics degradation.
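Putting the pieces together, the system-wide analysis and the preference function can be sketched as follows. This is a hedged reading of Algorithms 2 and 3: the uniform-crash reliability and performance models are re-derived inline, the cost function is passed in because its concrete form is elided here, and the parameters reproduce the sample case of Section IV-A (acceptable range 3 to 6, final solution 6):

```python
import math

def system_wide_mco(n_serv, cf, mu_r, r_th, p_th, c_th, cost,
                    p_c=0.005, d_c=3, ci=15):
    """Return the acceptable n_rout values: predictions within the
    reliability, performance, and cost thresholds, and no router
    overloaded (mu_r >= lambda_r)."""
    acceptable = []
    for n_rout in range(1, n_serv + 1):
        r = (n_rout + n_serv) * p_c * d_c * cf / ci      # reliability (r/s)
        p = 1000.0 / ((cf - r) * n_rout)                 # performance (ms)
        lam_r = math.ceil(n_serv / n_rout) * cf          # router arrival rate
        if r <= r_th and p <= p_th and cost(n_rout) <= c_th and lam_r <= mu_r:
            acceptable.append(n_rout)
    return acceptable

def prefer(acceptable, perf_weight):
    # High performance weight -> distributed routing (higher n_rout);
    # high reliability weight -> centralized routing (lower n_rout).
    return max(acceptable) if perf_weight >= 0.5 else min(acceptable)

# Sample case: 6 services, cf = 25 r/s, mu_r = 64 r/s, R_th = 1.2 r/s,
# P_th = 35 ms, C_th = 1 cent/s (cost assumed negligible here).
sols = system_wide_mco(6, 25, 64, 1.2, 35, 1, lambda n: 0)
print(sols)               # -> [3, 4, 5, 6]
print(prefer(sols, 1.0))  # -> 6
```

Note that n_rout = 2 satisfies both quality thresholds but is excluded because its router arrival rate (75 r/s) exceeds μ_r = 64 r/s, exactly as in the sample case.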

C. Autoscaling of Overloaded Components
In this paper, we study a multifaceted reconfiguration of dynamic routing applications. When a system component is in a transient state (see Equation (2)), request processing is delayed because of buffering in the overloaded component. In this case, we use multidimensional autoscaling to bring the transient component to a steady state. To clarify, we consider two reconfiguration measures in a per-component MCO analysis: horizontal autoscaling, i.e., scaling out the component, and vertical autoscaling, i.e., adding resources.
1) Buffer Fill Rate: We define the Buffer Fill Rate (BFR) as the difference between the arrival and processing rates, i.e., BFR = λ − μ. A positive BFR indicates that a component is in a transient state. In this case, we reconfigure the overloaded component. We define n_scal as the number of scaling replicas, n_pro as the number of processing rate improvements, and BFR(n_scal, n_pro) as the buffer fill rate prediction for multidimensional autoscaling:

BFR(n_scal, n_pro) = (n_req · cf) / (n_scal + 1) − (μ + n_pro) (13)

In this formula, n_req is the number of incoming requests for a component, and cf is the call frequency of requests. Equation (13) comes from the fact that scaling out an overloaded component divides its arrival rate by the total number of replicas, i.e., n_scal + 1. The BFR is also affected by the added processing rate, i.e., μ + n_pro.
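The prediction can be written as a one-line function; the sketch below matches the sample case of Section IV-B, where BFR(1, 40) = −4:

```python
def bfr(n_scal: int, n_pro: float, n_req: int, cf: float, mu: float) -> float:
    """Predicted buffer fill rate (r/s) after scaling out by n_scal
    replicas and adding n_pro r/s of processing rate: the arrival
    rate n_req * cf is split over n_scal + 1 replicas."""
    return (n_req * cf) / (n_scal + 1) - (mu + n_pro)

print(bfr(1, 40, 2, 100, 64))  # -> -4.0, i.e., steady
print(bfr(0, 0, 2, 100, 64))   # -> 136.0, i.e., transient
```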
2) Reconfiguration Cost: The cost of reconfiguration must be considered. Let C(n_scal, n_pro) be the cost of multidimensional autoscaling, C(n_scal = 1) the cost of scaling out a component by one replica, and C(n_pro = 1) the cost of increasing the processing rate of an overloaded component by one r/s. The cost depends linearly on the n_scal and n_pro improvements: C(n_scal, n_pro) = n_scal · C(n_scal = 1) + n_pro · C(n_pro = 1).
Section IV-B presents a parameterization and a sample case.
3) Per-Component MCO: We adjust the buffer fill rate of an overloaded component to bring it to a steady state. This reconfiguration is based on a second multi-criteria optimization analysis performed for each component separately. We aim to minimize BFR with a minimum reconfiguration cost. Remember that there is typically no single answer to an MCO problem but a set of acceptable points called the Pareto front [1]. Using a preference function, we choose a final solution that brings the component to a steady state according to Equation (1) with a minimum cost. Having done this analysis, all the components are in a steady state, and we must perform a system-wide analysis as described in Section III-B. Algorithm 4 presents the reconfiguration steps.
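A brute-force sketch of the per-component analysis: enumerate (n_scal, n_pro) candidates, keep those that make the component steady (non-positive BFR), and pick the cheapest. The 8 r/s granularity of n_pro (one 0.25-vCPU increment in the paper's experiment) and the unit costs, expressed here in vCPU increments rather than cents, are assumptions:

```python
def per_component_mco(n_req, cf, mu, c_scal_unit, c_pro_unit,
                      step=8, max_scal=4, max_pro=160):
    """Return the cheapest (n_scal, n_pro, cost) that brings the
    component to a steady state, i.e., BFR(n_scal, n_pro) <= 0."""
    best = None
    for n_scal in range(max_scal + 1):
        for n_pro in range(0, max_pro + 1, step):
            fill = (n_req * cf) / (n_scal + 1) - (mu + n_pro)
            if fill > 0:
                continue  # still transient: reject this candidate
            cost = n_scal * c_scal_unit + n_pro * c_pro_unit
            if best is None or cost < best[2]:
                best = (n_scal, n_pro, cost)
    return best

# Sample case of Section IV-B (n_req = 2, cf = 100 r/s, mu = 64 r/s):
# one replica of a mu = 64 r/s router costs eight 0.25-vCPU increments,
# and one added r/s costs 1/8 of an increment.
n_scal, n_pro, cost = per_component_mco(2, 100, 64,
                                        c_scal_unit=8.0, c_pro_unit=0.125)
print(n_scal, n_pro)  # -> 1 40
```

Under these assumed unit costs, the cheapest steady solution is (1, 40), matching the sample case's reconfiguration.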

IV. ILLUSTRATIVE SAMPLE CASES

A. Reconfiguration of Steady Components
We study an example from the data set of our experiment (see Section II for details) to parameterize our models and give sample cases. An example configuration is shown in Figure 3, where clients send requests to an API gateway that forwards them to the services. We observed the system for T = 600 s, had a crash interval of CI = 15 s, and studied uniform crash probabilities and downtimes for all components as P_c = 0.5% and d_c = 3 s. With these values, we can parameterize our reliability model (r/s) and performance model (ms) in Equations (3) and (4). In the example configuration, we have n_rout = 3 routers and n_serv = 6 services. Let us consider that this sample case has an expected call frequency of cf = 25 r/s, and all routers have a processing rate of μ = 64 r/s. We parameterize the arrival rates and the numbers of incoming requests of routers (solid arrows in Figure 3) to check whether they are overloaded. In our experiment, we allocated services equally to routers, so each router serves n_req = n_serv / n_rout services and receives λ_r = n_req · cf. In our sample case, n_req = 2 and λ_r = 50 r/s. Therefore, all routers are steady according to Equation (1).
To parameterize the cost functions, we use the Google Autopilot pricing. Autopilot allows increments of 0.25 vCPUs per container (the same is offered by Amazon Fargate), which corresponds to 8 r/s in our experiment. The scaling cost of our routers with μ = 64 r/s is derived accordingly, i.e., eight 0.25-vCPU increments per replica. We consider a reliability threshold of 1.2 r/s, a performance threshold of 35 ms, and a cost threshold of 1 cent/s. We study a case with a weight of 1.0 for performance and 0.0 for reliability. We perform the system-wide MCO analysis of Section III-B4 by rewriting Equations (19) and (20). Equation (25) shows that the reliability predictions in the range 1 ≤ n_rout ≤ 6 always satisfy the reliability threshold. In Equation (26), the constraint on the performance threshold of P(n_rout) ≤ 35 ms gives the lowest value for the number of routers as n_rout = 2. Therefore, the range for n_rout is 2 ≤ n_rout ≤ 6. Following Algorithm 2, we see that the cost threshold of 1 cent/s is always satisfied in this range. We then check whether any solution in this range results in overloading the routers. On the lowest bound, i.e., n_rout = 2, we have n_req = 3 and λ_r = 75 r/s according to Equations (21) and (22). Since μ_r = 64 r/s, this overloads the routers according to Equation (1). So we exclude this solution, and all the other points on the range are acceptable. The acceptable range is 3 ≤ n_rout ≤ 6. The performance weight is 1.0, so the preference function chooses the highest possible value for n_rout according to Algorithm 3. Therefore, the final solution is a configuration with six routers, i.e., n_rout = 6. We use this analysis also when illustrating our other scenario, i.e., autoscaling transient components multidimensionally to prevent system overload.

B. Autoscaling of Overloaded Components
Let us consider the studied example in Figure 3. Assume this application is stressed with a call frequency of cf = 100 r/s.
According to Equations (13), (21) and (22), we have n_req = 2 and λ_r = 200 r/s, so the routers with μ = 64 r/s are transient. Having the same cost threshold of C_th = 1 cent/s, we can rewrite the per-component MCO analysis of Section III-C3. As mentioned in Section III-C3, we choose a final solution that brings the component to a steady state with a minimum cost. Following Algorithm 4, this reconfiguration solution is n_scal = 1 and n_pro = 40, which gives a buffer fill rate of BFR(1, 40) = −4. This solution results in scaling out each router and increasing n_rout from three to six routers. Therefore, we must check that the system-wide MCO does not violate the thresholds. As we calculated before in Equation (35), the acceptable range of routers is 3 ≤ n_rout ≤ 6. So the solution is acceptable.

V. TOOL SUPPORT
We provide a prototypical tool in our online artifact. Figure 4 shows the high-level tool architecture. The Web Frontend of our application provides the functionalities to specify architecture configurations and model elements, such as thresholds and cost functions. This information is sent to the RESTful API in the backend, which invokes the Optimizer to perform MCO analyses and find the final reconfiguration solution. The IaC Module generates deployment artifacts to be used by IaC tools. Figure 5 shows the flow of the model reconfiguration. An architect specifies various model elements, i.e., the number of routers and services, thresholds, incoming call frequencies, the performance weight, processing rates of components, and cost functions. A reconfiguration is triggered when metrics degradation is observed, according to timers, or manually. When a reconfiguration is triggered, the backend performs an MCO analysis, chooses a final reconfiguration solution, and generates IaC artifacts. The reconfiguration visualization is then created using PlantUML and shown in the frontend.

VI. EVALUATION
This section systematically evaluates our approach in both scenarios illustrated in Section IV. We compare the model values to our empirical data set (see Section II). Note that our study is neither specific to our experiment infrastructure nor to our cases. We use our empirical data set to evaluate our approach using measured data from an extensive experiment.

A. Reconfiguration of Steady Components
We present our evaluation when components are steady. 1) Evaluation Cases: We systematically evaluate our method through various thresholds and importance weights for reliability and performance. We compare our model predictions with 9 experiment cases: three levels of routers and three levels of services, each operational for four levels of cf.
Regarding the cost threshold, we take C_th = 1 cent/s as in our illustrative sample cases. For the processing rates, we investigate 9 levels as follows. In Section IV-A, we mentioned that a component with one vCPU has a processing rate of roughly 32 r/s in our experiment. We start with components having two vCPUs up to six in increments of 0.5 vCPUs:
64 ≤ μ ≤ 192 r/s (48)

Regarding reliability and performance thresholds, we start with a tight reliability and a loose performance threshold so that more centralized routing is acceptable (lower value of n_rout). We increase the reliability and decrease the performance thresholds by 10% in each step so that distributed routing becomes applicable. To find the starting points, we consider the worst-case scenario of our empirical data. Equation (3) shows that a higher n_serv results in a higher expected request loss as the number of components increases. In our experiment, the highest number of services is ten. With n_serv = 10, the worst-case reliability for centralized routing and fully distributed routing (n_rout = 10) is 1.1 and 2.0 r/s, respectively.
Regarding performance, for the case of n_serv = 10, we investigate our predictions to find a range where a reconfiguration is possible. The lowest possible performance prediction is 33.7 ms, and the highest is 101.1 ms. We adjust these values slightly and take them as our boundary thresholds. We then analyze step-by-step, increasing the reliability threshold and decreasing the performance threshold by 10% as before.
2) Results Analysis: We evaluate 9801 systematic evaluation cases: 9 experiment cases, 9 processing rate levels, 11 importance weights, and 11 thresholds. To support reproducibility, the evaluation script and the evaluation log are provided in the online artifact of this study. We define the reliability gain RGain and the performance gain PGain as the average percentage differences of our predictions compared to those of fixed architectures. These formulas are based on the Mean Absolute Percentage Error (MAPE) widely used in cloud quality-of-service research [31].
Remember that R(n_rout) and P(n_rout) are the reliability and performance predictions. The gains are averaged over the 9 experiment cases.
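The gain formulas themselves are elided in this excerpt; a MAPE-style reading, i.e., the average relative improvement of the adaptive predictions over a fixed architecture for lower-is-better metrics, can be sketched as:

```python
def gain(fixed: list[float], adaptive: list[float]) -> float:
    """Average percentage improvement of adaptive predictions over a
    fixed architecture (both metrics are lower-is-better)."""
    diffs = [(f - a) / f for f, a in zip(fixed, adaptive)]
    return 100.0 * sum(diffs) / len(diffs)

# Two hypothetical cases where adaptation reduces the metric:
print(gain([40.0, 20.0], [10.0, 15.0]))  # -> 50.0
```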
Figure 6 shows the reliability and performance gains. Each subfigure shows the plots for our lowest studied processing rate of μ = 64 r/s and the highest bound in our research, i.e., μ = 192 r/s. Regarding reliability, we can see in Figure 6a that with a higher reliability weight, we have an increase in reliability gain with μ = 192 r/s. Remember that in Algorithm 2, we check that the components are not overloaded when choosing a more centralized routing to increase reliability. A higher processing rate allows a component to process higher call frequencies without being overloaded. However, as a result of choosing a less centralized routing, the gain in reliability is at most 16.60%.
Our approach provides significant improvements in performance gains. As more importance is given to the performance of a system, i.e., the performance weight increases, our approach reconfigures an application by choosing more distributed routing. This reconfiguration results in a rise of the performance gain, as shown in Figure 6b. On average, when cases with correct and incorrect architecture choices are analyzed together, our adaptive method provides a 74.22% performance gain. A higher gain for performance compared to reliability is expected. To clarify, Equations (19) and (20) show that changing the number of routers has a higher effect on the performance than on the reliability of a system. Our paper defines performance as the average processing time of requests per router. Having a higher number of routers to process the requests in parallel results in dividing the average processing time by more routers.

B. Autoscaling of Overloaded Components
This section systematically evaluates our approach regarding component overload prevention.
1) Evaluation Cases: As before, we study increments of 0.5 vCPUs, resulting in 9 μ levels. We consider the same range of call frequencies. However, since we are studying component overloads, we evaluate increments of 5 r/s, resulting in 19 cf levels.
2) Results Analysis: We evaluated 1539 cases for this scenario, i.e., three levels of n_serv, three levels of n_rout, 9 levels of μ, and 19 levels of cf. We define the average cost C and the average reconfiguration ratio RR, i.e., the amount of BFR improvement per cost spent: RR = (BFR(0, 0) − BFR(n_scal, n_pro)) / C, where BFR(0, 0) is the buffer fill rate without reconfiguration. We average over the three levels of services and three levels of routers.
In Figure 7a, we can see that the reconfiguration costs increase as the processing rate and the call frequency increase. This is expected, as reconfiguring a component with a higher processing rate is more expensive, especially when scaling out the overloaded component. Moreover, a higher incoming call frequency results in more overloaded components and, consequently, higher reconfiguration costs. However, as seen in Figure 7b, a higher cf results in a higher reconfiguration ratio. Our approach balances the costs with a bigger buffer fill rate improvement, with RR converging to an average of 2.62. The average reconfiguration cost over all cases is 0.0065 cents/s, bringing all overloaded system components to a steady state.

VII. THREATS TO VALIDITY
We discuss the four threat types by Wohlin et al. [32]. Regarding construct validity, we used request loss and the average processing time of requests per router as reliability and performance metrics, respectively. The threat remains that other metrics might model these quality attributes better, e.g., a cascade of calls beyond a single call sequence for reliability [22], or data transfer rates of m-byte-long messages for performance [19]. Moreover, we studied the reconfiguration measures of increasing the processing rate and scaling out a component to prevent system overload. While this is a common approach in service- and cloud-based research (see Section VIII), other measures might work better in terms of system overload prevention, for instance, changing the routing technology, e.g., using a circuit breaker [26]. More research with real-world systems is required to study such alternatives.
Regarding internal validity, we considered a simple reconfiguration strategy that starts the new setup in parallel with the running configuration to avoid impacts on reliability, e.g., request loss due to reconfiguration, and performance, e.g., increased processing time while reconfiguring. In a real-world system, this solution is cost-ineffective, as it introduces additional resource demands. The architects must specify a reconfiguration strategy based on their application needs to mitigate this threat. Moreover, we only considered constant load when modeling the stress of components using queuing theory. In reality, cloud-based systems are met with different load profiles, e.g., sudden load spikes. In future work, we plan to study more aspects of the proposed approach.
Concerning external validity, we designed our approach with generality in mind. However, the threat remains that evaluating our approach on another infrastructure may lead to different results. To mitigate this threat, we evaluated our proposed approach with an extensive systematic evaluation using the data of our experiment of 1200 hours (see Section VI). Moreover, the results might not be generalizable beyond the given experiment cases of 10-100 requests per second and call sequences of length 3-10. As this covers a wide variety of loads and call sequences in cloud-based applications, the impact of this threat should be limited. Furthermore, there must be a balance between the applicability of the proposed approach and the level of abstraction of the presented ideas, as in all research presenting models of a real-world phenomenon. To mitigate this threat, we performed many rounds of reviews and improvements in the author team and constantly compared our work with the related work.
Concerning conclusion validity, we used the Mean Absolute Percentage Error (MAPE) [31] as the statistical method to evaluate the accuracy of our prediction models. We defined the reliability and performance gains, as well as the average percentage difference of the buffer fill rate, based on MAPE, as it is widely used and offers good interpretability in our research context.

VIII. RELATED WORK
The proposed approach is related to self-adaptive systems, which typically use MAPE-K loops [4] and similar methods to realize adaptations. We extend such studies with support specific to service- and cloud-based dynamic routing applications. Moreover, research on efficient resource provisioning, e.g., [10], [18], and cloud elasticity, e.g., [12], [13], is related to our work. Our study extends these approaches by considering the increase in the processing power of a component as a reconfiguration measure. Moreover, we consider a multifaceted reconfiguration of components, taking into account system-wide and per-component optimizations.
Architecture-based decision making [1], [28] uses architectural tactics to search for (Pareto-)optimal architectural candidates. Architecture-based analysis approaches based on queueing theory have been studied by, e.g., [23], [30]. Like our study, those works focus on supporting architectural design and decision-making. In contrast to our work, they do not focus on specific kinds of architectures or architecture patterns in dynamic routing to prevent system overload; our approach focuses specifically on service- and cloud-based dynamic routing architectures.
We extend research on auto-scalers for the cloud, e.g., [5], [33], by adding specific cost studies. A particularly related work is [20], which introduces a cost-aware component that can be added to auto-scalers. Lesch et al. use workload prediction to decide on reconfiguration costs. Our approach differs from this work in that they study the reconfiguration of cloud-based Virtual Machines (VMs) and therefore consider time-based cost functions to rent these VMs. In contrast, we study components and consider cost functions related to the number of resources an application uses. Moreover, we consider the transient analysis, i.e., when components are overloaded and when they are in a steady state.
Multidimensional auto-scalers have been studied in the literature. AutoMAP [6] uses response time triggers to provision resources and finds optimal resources using Virtual Machine (VM) image sizes to support cost efficiency. Nguyen et al. [21] provide a forecasting model to predict CPU demand and use these predictions to start new machines before a load peak to increase performance. CloudScale [29] supports scaling of CPU and memory resources when local scaling is possible; otherwise, it migrates VMs to prevent overloaded hosts. Our work differs from all these studies because they consider auto-scaling at the VM level and configure the resources. We propose a cost-aware multidimensional auto-scaler that works at the component level, adjusting the resources of each component.
In contrast to the existing related work, the major contribution of our study is a model of system overload specifically designed for dynamic routing in service- and cloud-based architectures. Moreover, we perform a multifaceted reconfiguration considering the different states components can be in. Having this specific view and considering possible runtime adaptations, we defined multiple targeted reconfiguration algorithms to perform multi-criteria optimization analyses and find the (Pareto-)optimal reconfiguration solutions. This would be hard to do in the generic case.

IX. CONCLUSIONS
In this paper, we proposed a multifaceted reconfiguration approach that self-adapts between different routing patterns considering component overloads and idleness (RQ1). Moreover, we provided a prototypical tool that provides visualizations to study different architecture configurations (RQ2). We systematically evaluated our approach based on our empirical data (see Section II for details). Our extensive systematic evaluation shows significant improvements in quality trade-off adaptations and system overload prevention. In our experiment cases, our approach can yield up to a 16.60% reliability gain. On average, when cases with the right and the wrong architecture choices are analyzed together, our approach offers a 74.22% performance gain (RQ3). Our architecture adapts, based on triggers, e.g., a change of incoming load frequency or a degradation of monitoring data, to an optimal configuration to adapt quality trade-offs and prevent system overload. Before our work, architects needed to redesign and redeploy architecture configurations manually. For future work, we plan to apply our approach to real-world applications and evaluate the results. Moreover, we plan to cover different load profiles when autoscaling components multidimensionally.

Fig. 6 :
Fig. 6: Reliability and Performance Gains with Processing Rates of μ = 64 and 192 r/s

Fig. 7 :
Fig. 7: Plots of Evaluation Data for the Autoscaling of Transient Components

TABLE I: Table of Mathematical Notations

BFR_r — buffer fill rate of a router r in r/s
BFR(n_scal, n_pro) — buffer fill rate prediction for autoscaling in r/s
μ — processing rate of a component in r/s
μ_r — processing rate of a router r in r/s
λ — arrival rate of a component in r/s
λ_r — arrival rate of a router r in r/s
C(n_pro = 1) — cost of increasing the processing rate by one r/s in cents/s
C(n_scal, n_pro) — cost of multidimensional autoscaling in cents/s