,Class,Cite,authors,year,Venue type,journal,Venue ABBRV,,Migration phase,Task Automated,ML Integration,Automation degree,Output,Input type,Input type Category,Granularity of Data,Source of Data,Data preprocessing,ML Technique,ML Technique Category,Learning Approach,Feature Selection,Evaluation Metrics,Metric Category,Key_word,Qualitative Assessment,Key_word,Performance Benchmarks,Key_word,Experimental Setup,Key_word,Comparison to Other Approaches,Key_word,Success Criteria,Key_word,Tool availability,Key_word,RQ5
From legacy to microservices: A type-based approach for microservices identification using machine learning and semantic analysis,Identification,trabelsi2023_type_based_microservices,"Trabelsi, Imen and Abdellatif, Manel and Abubaker, Abdalgader and Moha, Naouel and Mosser, Sébastien and Ebrahimi-Kahou, Samira and Guéhéneuc, Yann-Gaël",2023,Journal,Journal of Software: Evolution and Process,JSEP,Wiley,Identification,"the data transformation and preprocessing, the classification phase, and the identification","semantic analysis, classification and clustering","automated: requires as input the number of microservices, a few labeled data, and the thresholds",Microservices,source code,Source Artifacts,classes,public repositories,"transform to KDM, then to a graph whose edges are the static relations and whose nodes are the classes; the features come from semantic analysis using CodeBERT and word2vec","CodeBERT, word2vec, SVM, Fuzzy C-means",Classical ML,"supervised for classification, unsupervised for clustering",relations and semantic analysis,"accuracy, precision, recall, f-measure, cmq, smq, chm, chd, ifn","Classification and Prediction , Software Design","accuracy, precision, recall, f-measure, cmq, smq, chm, chd, ifn","qualitative metrics: accuracy, precision, recall, f-measure, ","accuracy, precision, recall, f-measure, ",Ground truth created manually for the labeled systems and 
microservices,manually created GT,Experiments were conducted on 4 Java-based projects ,4 systems,compared to 2 state of the art approaches,compared to ServiceCutter and topic modeling approach,better precision and quality values,68.15% precision and 77% recall.,available,available,"not enough data, the need to label data manually"
Graph Neural Network to Dilute Outliers for Refactoring Monolith Application,Identification,desai2021_gnn_outlier_refactoring,"Desai, U and Bandyopadhyay, S and Tamilselvam, S and Assoc Advancement Artificial Intelligence",2021,Conference, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE,AAAI,Compendex,Identification,Identification of microservices,detecting outliers and clustering classes,automated: the user should specify the number of resulting microservices,clusters of classes representing microservices,Source code of monolith applications,Source Artifacts,Classes,Publicly available monolith applications,"Static analysis to extract method calls, entry points, class inheritance, and execution traces, followed by the construction of a graph ",Graph Convolutional Networks (GCNs).,Graph Based ML,"Unsupervised learning, with a focus on minimizing outlier effects while clustering classes into microservices.","Features include class dependencies, execution traces, and inheritance relationships","Modularity, Structural Modularity, Non-Extreme Distribution (NED), Interface Number (IFN).",Software Design,,Manual evaluation by software engineers,,four publicly available monolith applications,,"The experiments were conducted on four publicly available monolith applications, evaluated using metrics ",,"DeepWalk (Perozzi, Al-Rfou, and Skiena 2014), Node2vec (Grover and Leskovec 2016), ONE (Bandyopadhyay, Lokesh, and Murty 2019), GCN (Kipf and Welling 2016) and DGI (Veličković et al. 
2019) using k-means++",,"Improvement in modularity, structural modularity, and qualitative agreement with human annotators for outlier detection.",,The tool is publicly available ,available,"The subjective nature of evaluating refactorable classes creates challenges in validating the approach, The presence of outlier classes complicates clustering"
A Microservices Identification Method Based on Spectral Clustering for Industrial Legacy Systems,Identification,zhong2023_spectral_clustering_industrial_legacy,T. Zhong and Y. Teng and S. Ma and J. Chen and S. Yu,2023,Workshop,2023 IEEE Globecom Workshops (GC Wkshps),GC wkshps,IEEE Xplore,Identification,"data extraction, clustering for identification of microservices",in the clustering phase,highly automated,Microservices,executable source code,Source Artifacts,"classes, methods and performance logs","3 java systems: Jpetstore, solo, Spring blog","Static Analysis for calls between methods, Dynamic Analysis for Performance log and to capture the runtime characteristics of the legacy system: CPU runtime and memory occupation","graph-based spectral clustering algorithm: U. Von Luxburg, “A tutorial on spectral clustering,”",Graph Based ML,unsupervised,Performance log and to capture the runtime characteristics of the legacy system: CPU runtime and memory occupation and relations between methods,weighted modularity quality (MQw): a modified MQ metric,Software Design,weighted modularity quality (MQw),using the quality metric MQw,MQw,comparing to state of the art approaches,comparing MQw with two other approaches,3 java systems,3 java systems,"MEM : Extraction of microservices from monolithic software architectures. 
FOSCI: Service candidate identification from monolithic systems based on execution traces,” ","MEM, FOSCI ",Proposed Fusion method surpasses other baseline methods in terms of extracting highly modular microservice candidates.,MQw higher than state of the art approaches,available,available,
CARGO: AI-Guided Dependency Analysis for Migrating Monolithic Applications to Microservices Architecture,Identification,nitin2022_cargo_dependency_analysis,"Nitin, V and Asthana, S and Ray, B and Krishna, R and ASSOC COMPUTING MACHINERY",2022,Conference,"PROCEEDINGS OF THE 37TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE 2022",ASE,Compendex,Identification,identification of microservices,refining the partitioning,"automated: the user should provide: contextual information about the monolithic application, domain-driven divisions, feedback on the initial partitioning,",cluster of methods functional core of new microservices,source code and database interactions,Source Artifacts,method level,open-source and proprietary enterprise application,"static and dynamic analysis to identify method calls, data flow dependencies, and database interactions.",Context-sensitive Label Propagation (LPA),Classical ML,Unsupervised learning,"method dependencies, transactional dependencies, and heap dependencies between objects and methods.","Database Transactional Purity, Latency and throughput , cohesion and coupling, Inter-call Percentage (ICP) , Business Context Purity (BCP)","System Behavior, Software Design",,Performance Metrics and Architectural Metrics (same as evaluation metrics),Performance Metrics and Architectural Metrics,5 systems used in the data and daytrader benchmark,,"The experiments are conducted on five applications (4 open-source, 1 proprietary) and on a benchmark application deployed under various loads to measure performance improvements.",,"FoSCI, CoGCN, Mono2Micro",,reduces distributed transactions and increases throughput by 120% and decreases latency 
by 11%,,open-source tool on GitHub.,available,difficulty of applying ML-based methods for microservice identification in complex enterprise systems
Facilitating the migration to the microservice architecture via model-driven reverse engineering and reinforcement learning,Identification,dehghani2022_migration_model_driven_rl,"Dehghani, M and Kolahdouz-Rahimi, S and Tisi, M and Tamzalit, D",2022,Journal,SOFTWARE AND SYSTEMS MODELING,SSM,Compendex,Identification,Identification of microservices from monolith system,method-to-microservice mappings based on nanoentities and system models.,automated: some parameters are specified by the user,Microservices in form of clusters of nanoentities,"source code of the system, entity-relationship (ER) model, and use case model.","Source Artifacts, Model Artifacts",methods and nanoentities,publicly available code from GitHub,Preprocessing involves extracting a model of the source code using MoDisco and generating nanoentity models via Service-Cutter.,Reinforcement Learning (specifically Deep Q-Learning) ,Reinforcement Learning,Reinforcement Learning,"static features (like method calls, class hierarchies) and semantic features (like method meanings or embeddings) ",normalized accuracy,Classification and Prediction,normalized accuracy,Manual validation is used to compare RL-based mappings with ideal microservice decompositions.,Manual validation is used to compare RL-based mappings with ideal microservice decompositions.,-,-,the framework is tested on five case studies,The framework is tested on five case studies,"compared against Random Search and Brute Force methods,","Random Search and Brute Force methods,",high accuracy for all the systems,high accuracy for all the systems,The tool is available as an open-source GitHub repository,available as an open-source GitHub repository,"Reward Function Design: Designing an effective reward function for RL is challenging, as it must balance factors such as nanoentity ownership and 
method-to-microservice coupling, Scalability: Training the RL model for larger systems (with more than 9 microservices) requires additional time and computational resources."
Heterogeneous Data-Driven Failure Diagnosis for Microservice-Based Industrial Clouds Towards Consumer Digital Ecosystems,Monitoring,xu2023_heterogeneous_failure_diagnosis,"Xu, Y. and Qiu, Z. and Gao, H. and Zhao, X. and Wang, L. and Li, R.",2023,Journal,IEEE Transactions on Consumer Electronics,TCE,IEEE Xplore,Monitoring,The method automates the failure diagnosis process by identifying root causes and failure types,"Identify root causes, and classify failure types.",highly automated,location and type of failure (root cause localization),"Metric data: system resource usage (e.g., CPU, memory, and disk usage) and failure propagation information.",Runtime Artifacts,service level and component level.,three public real-world microservices-based datasets,"Includes time-series metric sampling, dependency analysis between failure units, and building of the HFDG to map relationships","Relational Graph Convolutional Network (RGCN), combined with Gated Recurrent Units (GRU)",Graph Based ML,supervised,"Features include resource metrics (CPU, memory, etc.) 
and relationships between microservices and failure units.",Top-k accuracy (A@k) and Mean Average Rank (MAR) are used to evaluate the prediction of root cause failures.,Classification and Prediction,Top-k accuracy (A@k) and Mean Average Rank (MAR) ,The approach is compared to human-labeled failures in the dataset and assessed for qualitative performance based on correctly identified root causes.,Compared to ground truth,"The approach is compared against several state-of-the-art models,","compared against several state-of-the-art models,","The evaluation includes the use of real-world microservices datasets, with results evaluated on GPU and CPU-based cloud systems.",GPU and CPU-based cloud systems.,"DejaVu, iSQUAD, and graph-based methods","DejaVu, iSQUAD, and graph-based methods","Success is measured by improvements in accuracy and ranking of failure predictions compared to baselines (e.g., lower MAR and higher A@k).",improvements in accuracy and ranking of failure predictions compared to baselines ,proof of concept and experimental code are shared in public repositories.,available as an open-source repository,"Heterogeneous Data Handling: The model must integrate different types of data , Accurately modeling time-series metric data is challenging, Dynamic Environments: Microservice architectures are dynamic, meaning that the failure dependency graph (HFDG) must be continuously updated to reflect changes in the system architecture." 
Expert system for automatic microservices identification using API similarity graph,Identification,sun2022_expert_system_identification,"Sun, Xiaoxiao and Boranbaev, Salamat and Han, Shicong and Wang, Huanqiang and Yu, Dongjin",2022,Journal, Expert Systems journal,ESJ,Wiley,Identification,microservices identification,"clustering to identify microservice candidates, semantic analysis",automated: the user should specify the number of clusters,candidate microservices,OpenAPI specification of the legacy system RESTful APIs,Domain Artifacts,API level: Each API is considered a node in the similarity graph.,open-source projects and one industry project.,semantic analysis that involves calculating TF-IDF scores for the topics and determining response message similarities. ,"K-means clustering, TF-IDF",Classical ML,unsupervised,"Features include candidate topics, response message types, and API descriptions","precision, recall, and accuracy. Calinski-Harabasz Index is used to evaluate clustering performance.","Classification and Prediction , Clustering","precision, recall, and accuracy. 
Calinski-Harabasz Index.",The identified microservices are compared to the ground truth microservices of the target applications to assess the quality of the decomposition.,compared to the ground truth,/,/,"The experiments were conducted on a machine with a 10-core Intel i9 CPU, 64 GB of RAM, and the system implemented in Java.","on a machine with a 10-core Intel i9 CPU, 64 GB of RAM, and the system implemented in Java.","The approach is compared to other interface-based decomposition methods (e.g., Baresi et al., 2017 and Al-Debagy & Martinek, 2020),","interface-based decomposition methods (e.g., Baresi et al., 2017 and Al-Debagy & Martinek, 2020), ",The approach is considered successful if it produces more accurate and consistent microservices decompositions with higher precision and recall compared to baseline methods.,higher precision and recall compared to baseline methods.,The code is available on GitHub,available on public repository,"The method relies on TF-IDF for similarity calculation, which can struggle when APIs use different terms to represent the same concept. This can result in incorrect topic matching and affect the clustering results. Granularity: Identifying the appropriate granularity for microservices is complex. The clustering algorithm might generate services that are too fine-grained or too coarse, requiring further refinement." 
Microservices Backlog-A Genetic Programming Technique for Identification and Evaluation of Microservices From User Stories,Identification,vera2021_microservices_backlog_genetic ,"Vera-Rivera, FH and Puerto, E and Astudillo, H and Gaona, CM",2021,Journal,IEEE ACCESS,ACCESS,IEEE Xplore,Identification,identification of microservices granularity,Specify the granularity of each microservice from the user stories,"semi-automatic: After the genetic programming algorithm generates potential microservice decompositions, developers may need to review and fine-tune the resulting clusters to ensure they align with the system's business logic and domain knowledge. Input Preparation: The initial product backlog and user stories must be well structured, often requiring manual refinement by the development team.",decomposition of user stories into microservices,User stories in the product backlog,Model Artifacts,User stories,"real-world projects (e.g., Foristom Conferences) and state-of-the-art applications (Cargo Tracking, JPet Store).","semantic and static analysis: user stories are analyzed for dependencies (business logic, data flow, and invocations between operations).",Genetic Programming (GP),Classical ML,Unsupervised learning via genetic programming. ,"user story dependencies, semantic similarity, coupling, cohesion, and complexity metrics.","coupling (CpT), cohesion (CohT), number of stories associated to the microservice (WsicT), cognitive complexity (CxT), semantic similarity (SsT), and the granularity metric (Gm). 
metrics of complexity (P: story points), communication, performance and estimated development time ",Software Design,"coupling, cohesion, complexity, interface count, number of calls between microservices, and semantic similarity.","coupling, cohesion, complexity, interface count, number of calls between microservices, and semantic similarity.",,"three projects: Cargo Tracking, JPet Store, and Foristom Conferences.",,"The experiments used 1000 population size, 400 iterations, and 500 mutations for the genetic programming algorithm.",,"compared to DDD, Service Cutter, MITIA, and Execution Traces",,"The approach successfully reduces coupling, complexity, and communication while ensuring higher cohesion and semantic similarity among user stories grouped in the same microservice.",,no mention of publicly available code,No,Scalability: The approach works well for small to medium-sized backlogs but may face scalability issues for very large product backlogs. Data Quality: The quality of the decomposition depends on the completeness and accuracy of user stories and their dependency information. 
Graph-Reinforcement-Learning-Based Dependency-Aware Microservice Deployment in Edge Computing,Deployment,lv2024_graph_rl_deployment,"Lv, WK and Yang, PF and Zheng, TY and Lin, CM and Wang, ZY and Deng, MW and Wang, Q",2024,Journal,IEEE INTERNET OF THINGS JOURNAL,IOTJ,IEEE Xplore,Deployment,"optimizing microservice deployment to minimize latency and resource conflicts, while satisfying QoS constraints.","Extract graph-structured data features from the microservices’ call graphs, and predict an optimal deployment strategy.",Fully automated,optimal deployment strategies,"Call graphs representing microservice dependencies, including information about service instances, conflicts, and QoS constraints: request response time",Runtime Artifacts,Microservices and method-level dependencies between services,synthetically generated call graphs and real-world microservice systems.,"The system models dependencies using adjacency matrices for microservices and service relationships, and resource utilization data for the servers. Then processed using GCN to produce feature vectors.",Graph Convolutional Network (GCN) and Deep Reinforcement Learning (DRL).,Reinforcement Learning,Unsupervised and Reinforcement learning ,"Features include node attributes (e.g., service instances, resource utilization) and topological information of the call graph (e.g., service dependencies).","QoS satisfaction, deployment overhead, and response time. Experiments also measure convergence rate and number of deployed containers.",System Behavior,"QoS satisfaction, deployment overhead, and response time. 
convergence rate and number of deployed containers.","Performance is validated through simulations, comparing the proposed method ",,,,"The system simulates an edge computing environment with 8 CPU cores and 16 GB memory per edge server, deploying microservices within Docker containers.",,"Compared to Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), ",,higher QoS satisfaction rates and lower resource consumption compared to baseline methods.,,no mention of publicly available code,No,"Scalability: As the number of services and call graphs increases, the complexity of the GCN and DRL models grows, making training more resource-intensive. Reward Function Tuning: Designing an effective reward function for DRL that balances QoS constraints and deployment overhead is difficult."
BertHTLG: Graph-Based Microservice Anomaly Detection Through Sentence-Bert Enhancement,Monitoring,chen2023_bert_htlg_detection,"Chen, Lu and Dang, Qian and Chen, Mu and Sun, Biying and Du, Chunhui and Lu, Ziang",2023,Conference,International Conference on Web Information Systems and Applications.,WISA,Scopus,Monitoring,anomaly detection: identifying anomalous interactions between microservices.,"Sentence-BERT (SBERT), which is used to enhance the graph embeddings. 
The system also uses heterogeneous graph-based learning for representing microservice interactions and identifying anomalous patterns.",highly automated,set of detected anomalies in the microservice architecture,logs and trace data from microservices: interactions and performance metrics between microservices.,Runtime Artifacts,service-to-service interaction level,real-world microservice applications,"log parsing, extracting semantic information from the logs using Sentence-BERT embeddings, and constructing a heterogeneous graph of service interactions based on the logs and traces.",Sentence-BERT (SBERT) for embedding the log data and a graph neural network (GNN) applied to the heterogeneous temporal-log graph (HTLG).,Graph Based ML,unsupervised,"Features include the semantic meaning of log entries, interaction patterns, and service-to-service dependency data.","precision, recall, and F1-score to measure the effectiveness of anomaly detection.",Classification and Prediction,"precision, recall, and F1-score ","case studies of detected anomalies, comparing the results to ground-truth known anomalies in microservice systems.",comparing the results to ground-truth known anomalies in microservice systems.,The system is compared to baseline methods for anomaly detection in microservices,compared to baseline methods ,using large-scale microservice systems,using large-scale microservice systems,log-based anomaly detection: DeepLog (2017) and LogAnomaly (2019). 
graph-based anomaly detection: iForest (Isolation Forest),"3 approaches: DeepLog (2017), LogAnomaly (2019), iForest (Isolation Forest)",detect anomalies with higher accuracy and lower false positive rates compared to baseline methods.,higher accuracy and lower false positive rates ,No available tool,No available tool,"Scaling the approach to handle large-scale microservice systems with many nodes and interactions can be computationally expensive, especially given the complexity of Sentence-BERT and graph neural networks., The system has to deal with both structured and unstructured data (logs, traces) and construct a coherent heterogeneous graph from this data, which is a non-trivial task."
MIRAS: Model-based reinforcement learning for microservice resource allocation over scientific workflows,Deployment,yang2019_miras_resource_allocation_workflows ,"Yang, Z. and Nguyen, P. and Jin, H. and Nahrstedt, K.",2019,Conference,Proceedings - International Conference on Distributed Computing Systems,ICDCS,IEEE Xplore,Deployment,resource allocation,predicts the system’s performance,Fully automated,resource allocation policies,"task queue information, task dependencies, task-level metrics (such as processing delay and work-in-progress), and resource allocation data","Runtime Artifacts, Model Artifacts",task level (tasks in workflows) and microservice level (number of consumers allocated).,real-world scientific workflow systems ,collecting workflow logs and aggregating data on task dependencies and consumer load.,Deep Reinforcement Learning (DRL) ,Reinforcement Learning,Reinforcement Learning,"WIP represents the workload on each microservice, and resource allocation","work-in-progress (WIP), average delay of workflows, and response time.",System Behavior,"work-in-progress (WIP), average delay of workflows, and response time.",no,no,Comparing other models based on the resource utilization and response time.,Comparison to other models,"The system was tested on a cluster of 
Google Cloud virtual machines, with a 3-layer predictive model for MSD and 1-layer model for LIGO.",Google Cloud virtual machines,"DDPG (model-free RL), DRS (Dynamic Resource Scheduling), and MONAD,",,lower response times and better long-term returns on resource allocation.,lower response times and better resource allocation.,no mention of publicly available code,no mention of publicly available code,"Data Scarcity: One challenge is training the model with limited real-world data, especially when the input space is large. Overfitting: In complex workflows (like LIGO), there is a risk of overfitting the predictive model due to insufficient training data"
Improving Industry 4.0 Readiness: Monolith Application Refactoring using Graph Attention Networks,Identification,rathod2023_industry4_refactoring,"Rathod, T. and Joseph, C.T. and Martin, J.P.",2023,Workshop,"Proceedings - 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing Workshops, CCGridW 2023",CCGridW,IEEE Xplore,Identification,microservice candidate identification,make clustering decisions and outlier detection.,automated: number of microservices,"The output is a set of clusters: microservice candidates, along with detected outliers ",source code ,Source Artifacts,"class level,","four publicly available web-based applications: DayTrader, PBW, DietApp, and Acme-Air.","static analysis: converting the monolithic application's source code into a graph where nodes represent classes, and edges represent relationships (e.g., method invocations). And features : entry points and inheritance relationships.",Graph Attention Networks (GATs) with both single-head and multi-head attention mechanisms.,Graph Based ML,Unsupervised learning,"method invocation data, inheritance relationships, and execution traces.",Structural Modularity: cohesiveness and coupling. Non-Extreme Distribution (NED). Interface Number (IFN). 
microservice.Interaction Number (IRN).,Software Design,Structural Modularity: cohesiveness and coupling. Non-Extreme Distribution (NED). Interface Number (IFN). microservice.Interaction Number (IRN).,with the metrics and comparison,metrics and comparison,/,/,The experiments use four publicly available applications and compare the performance of the proposed single-headed and multi-headed GAT models.,The experiments use four applications and compare the performance with two approaches,Node2vec and CO-GCN,Node2vec and CO-GCN,"improved modularity, interface management, and loose coupling compared to traditional methods like Node2vec and CO-GCN.","improved modularity, interface management, and loose coupling compared to traditional methods like Node2vec and CO-GCN.","The tool is implemented as a research prototype, with no mention of public availability.",No public availability.,"Scalability: As monolithic applications grow in size, scaling the GAT model to handle larger graphs with complex dependencies becomes challenging. Clustering Quality: Balancing the trade-off between tight clustering (high cohesion) and loose coupling between microservices requires fine-tuning the model's hyperparameters and loss functions."
The Application of ChatGPT for Identification of Microservices,Identification,stojanovic2023application ,"Stojanovic, Tatjana, and Saša D. 
Lazarević",2023,Conference,E-business technologies conference proceedings ,EBT,IEEE Xplore,Identification,microservices identification,microservices identification,Automated: requires a description from the user,Microservices names and description,Description of the system,Domain Artifacts,Microservices,Open source,Structuring prompt,LLM-ChatGPT,Deep Learning,unsupervised,text,qualitative evaluation by expert,Developer-Centric,By expert,by expert,by expert,/,/,prompt to ChatGPT,prompt to ChatGPT,no comparison,no comparison,provide solutions which make logical sense ,provide solutions which make logical sense ,not available,No tool,-
Fuzzy Reinforcement Learning based Microservice Allocation in Cloud Computing Environments,Deployment,joseph2019_fuzzy_rl_allocation,C. T. Joseph and J. P. Martin and K. Chandrasekaran and A. Kandasamy,2019,Conference,TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON),TENCON,IEEE Xplore,Deployment,allocating microservices onto hosts ,optimize microservice placement based on server utilization and energy consumption.,High automation; the algorithm autonomously learns the best allocation policy.,resource allocation plan,"Resource utilization metrics such as CPU usage, memory, number of processing elements (PE), and incoming microservice requests.",Runtime Artifacts,Host-level and service-level metrics,"Open source, real world Google Cluster Trace dataset.","Metrics are collected and categorized into classes for CPU utilization, ranging from low to very high ",Fuzzy Q-Learning,Classical ML,Reinforcement learning,"Features include CPU utilization, number of CPU cores requested, and memory requirements.","Energy consumption, Service Level Agreement (SLA) violation rate, and SLA degradation",System Behavior,"Energy consumption, SLA violation rate, and SLA degradation",-,-,Google cluster trace.,Google cluster trace.,Simulated environment using the ContainerCloudSim toolkit with real Google cluster trace.,Simulated environment ,compared 
against the First-Fit and Most-Full placement policies.,First-Fit and Most-Full placement policies.,Reduced energy consumption and lower SLA violations compared to baseline methods.,Reduced energy consumption and lower SLA violations ,Not available,Not available,Determining appropriate thresholds for state transitions in fuzzy logic. Balance between energy efficiency and SLA compliance: Ensuring both goals are achieved simultaneously without significant trade-offs.
ChainsFormer: A Chain Latency-Aware Resource Provisioning Approach for Microservices Cluster,Deployment,song2023_chainsformer_latency_aware_provisioning,"Song, C. and Xu, M. and Ye, K. and Wu, H. and Gill, S.S. and Buyya, R. and Xu, C.",2023,Journal,Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics),Lecture Notes in Computer Science,Scopus,Deployment,resource provisioning for microservices in a cluster based,predict optimal resource allocations based on chain latency.,Fully automated,"resource allocation decisions (e.g., CPU, memory per microservice) ","microservice performance logs (CPU utilization, memory usage, and disk I/O operations), latency metrics, and resource usage data (CPU, memory).",Runtime Artifacts,microservice level,real-world microservice clusters,collecting logs related to resource usage and latency for each microservice in the chain and converting this data into structured features: Feature Extraction then Time-Series Aggregation,Reinforcement Learning (RL): Deep Q-Learning (DQL) ,Reinforcement Learning,Reinforcement learning,"resource usage metrics (CPU, memory) and latency for each microservice in the chain, QoS metrics","average chain latency, resource utilization (CPU, memory), and QoS satisfaction. ",System Behavior,"average chain latency, resource utilization, and QoS satisfaction. ","Performance is assessed through simulations in a real-world microservice cluster,","Performance is assessed through simulations in a real-world microservice cluster,",/,/,"The system is tested in a microservice cluster with multiple nodes, and resource allocation is dynamically adjusted based on real-time feedback from the environment.",,traditional static resource provisioning approaches: CoScal and FIRM,CoScal and FIRM,"Success is measured by improvements in latency, resource usage, and the system's ability to handle dynamic workloads effectively.","improvements in latency, resource usage, and ability to handle dynamic workloads effectively.",Not available,Not available,"Scalability: Training the RL agent for large-scale microservice chains with complex interdependencies can be resource-intensive and requires efficient exploration strategies. Reward Function Design: The reward function must balance between reducing chain latency and minimizing resource usage, which can be challenging to optimize."
Learning-Based Microservice Placement and Migration for Multi-Access Edge Computing,Deployment,ray2023_microservice_placement_edge,K. Ray and A. Banerjee and N. C. Narendra,2023,Journal,IEEE Transactions on Network and Service Management,TNSM,IEEE Xplore,Deployment,microservice placement,Reinforcement Learning (RL) and Learning Automata (LA) are used to optimize prefetching and migration strategies.,High degree ,resource allocation,Servers: service radius and capacity. 
A set of applications (workflow of microservices) and mobility traces,"Technical Artifacts, Runtime Artifacts",Per-microservice instance ,Real-world mobility traces (San Francisco Taxi Dataset) and microservice benchmarks from the DeathStarBench suite.,User mobility and microservice invocation patterns are modeled and combined with MEC server capacities.,Reinforcement Learning (RL) and Learning Automata (LA).,Reinforcement Learning,RL and unsupervised,"User mobility, MEC server capacity, and microservice workflows.","Latency, server resource utilization, and percentage of successfully allocated microservices",System Behavior,"Latency, server resource utilization, and accuracy",-,-,"DeathStarBench microservices, San Francisco taxi dataset for mobility.",DeathStarBench,Simulation-based environment with MEC servers and mobile users.,Simulation-based environment with MEC servers and mobile users.,Compared against on-demand placement and the MCAPP-IM algorithm.,On-demand placement and the MCAPP-IM algorithm.,Lower latency and higher successful microservice allocation compared to reactive methods.,Lower latency and higher successful microservice allocation compared to reactive methods.,Not available,Not available,Adapting to rapidly changing conditions like user movement and varying server capacities can lead to slower learning convergence. Managing large state spaces as microservice workflows and user mobility increase complexity. GMA: Graph Multi-agent Microservice Autoscaling Algorithm in Edge-Cloud Environment,Monitoring,tong2023_gma_autoscaling_edge_cloud,G. Tong and C. Meng and S. Song and M. Pan and Y. Yu,2023,Conference,2023 IEEE International Conference on Web Services (ICWS),ICWS,IEEE Xplore,Monitoring,microservice autoscaling,state prediction and collaboration between edge and cloud servers.,High automation,microservice autoscaling,Microservices application with their edge and cloud servers. 
,"Runtime Artifacts, Technical Artifacts",Per-server and per-microservice instance,Real-world traffic data from Istio Bookinfo application (used as a benchmark) and constructed synthetic traffic datasets.,"metrics, workload characteristics, and network states. transformed into graph-structured data for GCNs.",Multi-Agent Deep Deterministic Policy Gradient (MADDPG) for multi-agent RL and Graph Convolutional Networks (GCNs) for state prediction and collaboration.,Reinforcement Learning,RL and unsupervised,"Features include server states (CPU/memory usage), network traffic, and microservice performance metrics (latency, SLA violations).","Average Waiting Time (AWT), SLA violation rate, P99 latency, and standard deviation of autoscaling performance.",System Behavior,"AWT, SLA violation rate, P99 latency, and standard deviation.",-,-,Istio Bookinfo application running across eight servers,Istio Bookinfo application running across eight servers,"Conducted on eight virtual machines with 24 cores, 32GB memory, and Docker/Kubernetes for containerized microservices.","8 VM, memory, and Docker/Kubernetes.","SARSA, DQN, and Decision Tree algorithms.","SARSA, DQN, and Decision Tree algorithms.","Better performance in terms of lower latency, fewer SLA violations, and stable autoscaling process.","Better performance in terms of lower latency, fewer SLA violations, and stable autoscaling process.",Not available,Not available,Ensuring the algorithm can scale efficiently with increasing user traffic and microservice instances. Topology-Aware Self-Adaptive Resource Provisioning for Microservices,Deployment,zeng2023_topology_adaptive_provisioning ,H. Zeng and T. Wang and A. Li and Y. Wu and H. Wu and W. 
Zhang,2023,Conference,2023 IEEE International Conference on Web Services (ICWS),ICWS,IEEE Xplore,Deployment,Resource allocation for microservices,Graph Neural Networks (GNNs) for resource and correlation feature extraction; Reinforcement Learning (RL) for resource allocation.,Fully automated resource allocation using a self-adaptive model.,resource allocation plan,"application runtime metrics: resource metrics (like CPU usage, memory) and network metrics (like throughput and latency)",Runtime Artifacts,Microservices,"benchmark applications (TrainTicket, SocialNetwork, MediaService) and open-source data (e.g., Alibaba trace data)","Topological extraction, correlation analysis between microservices, resource, and QoS metrics",Graph Neural Networks (GNN) and Deep Deterministic Policy Gradient (DDPG) a type of RL,Graph Based ML,Reinforcement Learning (RL) and graph-based learning (unsupervised),"Microservice resource metrics (CPU, memory) and network metrics (throughput, latency).","Accuracy of critical microservice detection, resource utilization, and end-to-end latency.","System Behavior, Classification and Prediction",accuracy,Comparison with other methods: SVM-RL and CNN-RL.,Comparison with other methods: SVM-RL and CNN-RL.,"TrainTicket, SocialNetwork, and MediaService benchmarks.","TrainTicket, SocialNetwork, and MediaService benchmarks.","Intel i9-10900X CPUs, 256G RAM, virtual machines using AWS instances.","Intel CPUs, 256G RAM, virtual machines using AWS instances.",SVM-RL and CNN-RL: in resource provisioning and critical microservice detection.,SVM-RL and CNN-RL.,"Improved accuracy in detecting critical microservices (up to 7.22% over CNN-RL), reduced end-to-end latency by 22%, and increased resource utilization by 18%.","Improved accuracy in detecting critical microservices, reduced end-to-end latency, and increased resource utilization.",Not available,Not available,"Heterogeneity in microservices' resource requirements, fluctuating workloads, and topological complexity. 
Threats to validity include varying benchmark sizes and limitations in real-world applicability of benchmarks. Ensuring the model scales effectively with larger systems and topologies is a challenge due to increased complexity." Trace Anomaly Detection for Microservice Systems via Graph-based Semi-supervised Learning,Monitoring,ding2024_trace_anomaly_detection ,"Ding, S and Yuepeng, E and Zhang, J and Li, L and ...",2024,Conference,International Conference on Computer Supported Cooperative Work in Design,CSCWD,IEEE Xplore,Monitoring,detection of trace anomalies in microservice systems by building service invocation graphs from traces and applying semi-supervised learning to identify abnormal behaviors based on graph representations of trace data.,"Graph Neural Network (GNN), specifically a Message-Passing Neural Network (MPNN), to learn the latent structure of service invocation graphs. It then employs the Deep Semi-supervised Anomaly Detection (DeepSAD) algorithm to identify anomalies by learning a hypersphere that encloses normal data points, flagging outliers as anomalous traces.","The entire process from parsing traces, constructing service invocation graphs, training the model, and detecting anomalies is automated. The system uses limited labeled data to achieve effective anomaly detection.","trace anomaly detection score, identifying abnormal traces based on their distance from the center of the hypersphere. Anomalous traces are flagged when they fall outside the hypersphere.","trace logs from microservice systems, which contain multimodal information such as invocation sequences, response times, and service topologies.",Runtime Artifacts,"The method operates at the trace and span levels. Each trace consists of multiple spans representing different service invocations, which are modeled as nodes in a service invocation graph.","evaluated using data from the TrainTicket microservice system deployed on Kubernetes. 
The dataset includes 234,665 traces, with 21,695 anomalous traces injected using ChaosMesh for fault injection","The traces are parsed to create service invocation graphs, where nodes represent service instances, and edges represent invocation relationships. Each node is represented by a 300-dimensional vector embedding, generated using FastText. Edges are enriched with metrics such as processing time, network latency, and response status codes",Message-Passing Neural Network (MPNN) to learn node representations in the graph. It combines this with the DeepSAD algorithm to detect anomalies by minimizing the volume of a hypersphere that encloses normal data points,Deep Learning,The method applies semi-supervised learning. A small portion of labeled anomaly data is used along with a larger volume of unlabeled normal data. The DeepSAD model helps map normal data points inside a hypersphere and identifies anomalies outside it.,"Features include service names, operation names, processing time, network latency, duration time, response status codes, and invocation patterns (synchronous/asynchronous). These features are embedded and used as node and edge attributes in the service invocation graph","precision, recall, and F1-score. TraceGSAD achieves a precision of 96.8%, recall of 94.1%, and an F1-score of 95.4%, outperforming other baseline methods",Classification and Prediction,"precision, recall, and F1-score","The model outperforms several baseline methods (Multimodal-LSTM, TraceAnomaly, PUTraceAD, and SupervisedTraceAD) due to its ability to combine graph structures with invocation sequences and performance metrics. It also handles both synchronous and asynchronous invocations better than other methods",outperformed other baseline approaches,"compared against Multimodal-LSTM, TraceAnomaly, PUTraceAD, and SupervisedTraceAD. 
TraceGSAD achieved the highest F1-score and precision across the benchmarks",compared with 5 techniques ,"Experiments were conducted on a Kubernetes cluster using the TrainTicket system, with faults injected via ChaosMesh to generate anomaly data. The environment included an Intel Xeon CPU and a V100 GPU",Kubernetes,"The method was compared against Multimodal-LSTM, TraceAnomaly, PUTraceAD, and SupervisedTraceAD. It significantly outperformed these approaches in terms of precision and robustness to noisy training data",outperforms other approaches,"Success is defined by the model’s ability to accurately detect anomalies in traces, with a focus on achieving high precision, recall, and F1-score, while minimizing false positives",ability to accurately detect anomalies in traces,PoC,Not available,"The combination of GNN and Transformer layers increases the model's complexity, which could impact efficiency and scalability." Microservice anomaly detection based on tracing data using semi-supervised learning ,Monitoring,li2021_tracing_data_anomaly_detection,"Li, M. and Tang, D. and Wen, Z. and others",2021,Conference,International Conference on Artificial Intelligence and Big Data,ICAIBD,IEEE Xplore,Monitoring,anomaly detection and root cause localization in microservice systems using tracing data. It leverages a semi-supervised learning model to detect microservice failures and locate the root causes by analyzing service call chains,"semi-supervised learning to train features using both labeled and unlabeled data from tracing logs and performance indicators. The model extracts features from tracing data and performance monitoring metrics to detect anomalies in real time, allowing for fault diagnosis with minimal labeled data","The system automatically processes tracing data, extracts features using semi-supervised learning, and dynamically detects anomalies using sliding windows. 
It automates root cause localization by mapping anomalies to the causal relationships between services","anomaly detection results and a list of suspected root causes for identified failures. The system provides time-period-specific anomaly flags, service tags, and associated trace information","tracing data (e.g., trace-ID, parent-ID, child-ID) and performance monitoring indicators (e.g., CPU usage, network delay) collected from microservice systems",Runtime Artifacts,trace level (entire service call chains) and span level (individual service invocations). Both tracing data and performance metrics are analyzed to detect anomalies at different levels,"AIOps Challenge (2020), which provides public data on a microservice system, including both normal operations and fault-injected failures","feature extraction from time-series tracing data. The features include time-period segmentation, low-value filtering to reduce noise, and clustering of abnormal points based on deviation from normal behavior",semi-supervised clustering to extract time-period-specific features and sliding window methods for detecting anomalies. It also uses deviation degree calculations to cluster abnormal features and identify root causes,Classical ML,"semi-supervised learning, which allows the model to learn from both labeled and unlabeled data. The system is initially trained on normal data, and a small portion of manually labeled fault data is used to improve anomaly detection accuracy","low-value features (to filter noise), time-period features (to capture temporal patterns), mutation features (to detect sharp changes), and abnormal features (trained from fault-labeled data). Empirical features, such as CPU usage thresholds, are also considered","evaluated using accuracy and root cause localization accuracy. 
It achieved an overall anomaly detection accuracy of 99% and a root cause location accuracy of 98.5%, indicating high performance in both detecting failures and identifying their sources",Classification and Prediction,root cause localization accuracy,"The model outperforms traditional methods by combining both tracing data and performance metrics. It uses semi-supervised learning to handle dynamic system behaviors and reduce false positives, particularly when system performance exhibits jittery behavior",outperformed traditional methods,"benchmarked against a public dataset (AIOps Challenge 2020), achieving high accuracy in detecting anomalies and localizing root causes. It outperformed other models that rely solely on performance metrics or fully labeled data",better performance than other models that rely solely on performance metrics or fully labeled data,"The experiments were conducted using public tracing data from the AIOps Challenge, which provided both normal and fault-labeled data. The system was tested on a daily dataset of 1.2 GB of microservice data",1 big OS system,No mention,No mention,ability to accurately detect anomalies (99% accuracy) and identify root causes (98.5% accuracy) using minimal labeled data. 
The ability to handle jittery time-period data and reduce false positives further validates its effectiveness,"ability to accurately detect anomalies and root causes",PoC,Not available,"The reliance on semi-supervised learning helps mitigate the need for extensive labeling, but the model's performance depends on the availability of accurate fault-labeled data for initial training" ,Identification,morais2021_ontology_microservices ,"Morais, Gabriel and Bork, Dominik and Adda, Mehdi",2021,Conference,Proceedings of the 13th international conference on management of digital ecosystems,MEDES,ACM,Identification,"identifying, comparing, and modeling microservices architectures",predict similarity scores based on extracted features between microservices.,"Semi-automated, since the Stardog Similarity Model simplifies and accelerates similarity detection but requires proper feature extraction and input configuration.",The result generated is a similarity score between microservices,"functional descriptions, technological attributes, and relationships between services such as interactions and dependencies.",Runtime Artifacts,microservice level,use cases with 25 microservices,"the cleaning of data to improve understanding, as performed in the manual analysis phase",Stardog Similarity Model,Classical ML,semi-supervised,"functional attributes, protocols, interactions, and coevolutions of microservices.",closeness of results to manual expert analysis.,Developer-Centric,closeness of results to manual expert analysis.,Compared to manual expert analysis. ,manual expert analysis. ,Ground truth,Ground truth,/,/,EdgeSim and manual analysis,EdgeSim and manual analysis,The similarity identification is considered successful if it closely matches expert analysis in terms of functional and technological features.,closely matches expert analysis ,Not available,Not available,The challenge of generalizing similarity metrics across different microservices architectures. 
Learning predictive autoscaling policies for cloud-hosted microservices using trace-driven modeling,Deployment,abdullah2019_autoscaling_policies,"Abdullah, M and Iqbal, W and Erradi, A and ...",2019,Conference,IEEE International Conference on Cloud Computing Technology and Science (CloudCom),CloudCom,IEEE Xplore,Deployment,"The entire process, including data extraction (through stress testing to gather performance metrics), data analysis (by building a response time model), and simulation for autoscaling decision-making, is automated. The automation primarily focuses on generating workload-specific predictive autoscaling policies to minimize SLO violations while reducing the time and cost for gathering performance traces.","Machine Learning is applied during the data analysis and modeling stages to learn and predict autoscaling policies. ML models (e.g., Decision Tree Regressor) are trained using data collected from trace-driven simulations to forecast workload and required resources. These models predict the number of VMs needed to satisfy response time SLOs for different workloads, including synthetic and real-world data.","The system achieves a high degree of automation by dynamically adjusting resources based on workload forecasts and response time requirements. The ML model predicts the number of VMs required, and the system automatically scales out or scales in resources accordingly, minimizing manual intervention.","The primary output is a predictive model for resource provisioning, which predicts the necessary resources to satisfy response time requirements for future workloads. ","The input data includes performance logs gathered through stress testing. These logs record metrics like response time and VM usage. 
Additionally, code representing the microservice (a CPU-bound application in PHP) and workload generators are used to simulate concurrent requests.",Source Artifacts,"The data granularity includes time intervals (one minute), request count, and 95th percentile response time. These metrics are recorded continuously across each interval to capture system performance at a detailed level.","Industrial: Real-world workloads like Wikipedia and World Cup data, which represent realistic load patterns. Synthetic: Created by the authors, including a linearly increasing and periodic workload. Generated by System: Stress testing of the microservice on AWS generates logs specific to system performance under load.","Data Collection: Logs are collected by running stress tests and autoscaling simulations. Trace-Driven Simulation: These traces (logs) are fed into a simulated environment to model and learn predictive policies. Filtering: Only intervals meeting the SLO criteria are used for model training, removing data that shows SLO violations.",regression techniques for predictive modeling. Specifically: Decision Tree Regressor (DTR) is highlighted as the best-performing model for predicting resource needs.,Classical ML,"supervised learning. Models are trained on labeled data gathered from trace-driven simulations, where input features (e.g., request count) correspond to output labels (e.g., number of VMs required).",Number of requests per time interval; 95th percentile of response time,"The evaluation relies on metrics such as total processed requests, response time SLO violations, and resource costs. The model’s performance is assessed based on its ability to reduce SLO violations and minimize costs, compared to a reactive autoscaling baseline.",System Behavior,"processed requests, response time SLO violations, and resource costs.","The qualitative assessment involves observing SLO adherence and cost efficiency of predictive autoscaling under various workloads. 
Additionally, visual comparisons are made using graphs that show the effectiveness of predictive scaling in maintaining lower response times and reducing SLO violations.",SLO adherence and cost efficiency of predictive autoscaling,"comparisons with a baseline reactive autoscaling method. The predictive approach is measured against the reactive method in terms of response time adherence, cost, and processed requests across four different workloads (e.g., Wikipedia, World Cup, periodic, and increasing workloads).","baseline reactive autoscaling method (Wikipedia, World Cup, Increasing, and Periodic workloads.)","The experiments were conducted on Amazon EC2 cloud infrastructure using c5.large and c5.9xlarge instances, focusing on CPU usage and response time under different simulated workload patterns. Nginx is used as a load balancer, and httperf for generating concurrent user requests, simulating realistic conditions without GPU involvement.",Amazon EC2 cloud infrastructure,compared to a baseline reactive autoscaling approach.,compared to a baseline reactive autoscaling approach.,Success is measured by reducing SLO violations and improving cost efficiency.,Reducing SLO violations and improving cost efficiency.," proof of concept (PoC) designed for research purposes. The components used, such as Amazon EC2 instances for stress testing and simulation, are commercially available but not necessarily a single open-source tool or package. The document does not mention any specific open-source libraries for ML but uses general ML models.",Not available,"Challenges include ensuring generalizability, accurate forecasting, and managing training costs, with future work focused on real-time model adaptation." 
High-performance computing-enabled probabilistic framework for migration from monolithic to microservices architecture using genetic algorithms,Pre-migration,alshammari2023_genetic_hpc_migration,"Alshammari, A and Almadhor, A and Qasem, SN and Alkhateeb, JH and Amjad, K",2023,Journal,SOFT COMPUTING,SOFT COMPUTING,Scopus,Pre-migration,analyzing challenges and predicting success rates of migration from monolithic to microservices architecture (MSA),"The framework integrates Naive Bayes and Logistic Regression models to predict success likelihood, and Genetic Algorithm (GA) to optimize the migration process.",automated: GA parameters are specified by the user,"success prediction and an optimized cost model","project characteristics, success factors, and historical migration data from surveys.",Domain Artifacts,High-level system factors: challenges,surveys and expert interviews,"designing and structuring survey data, refining questionnaires, and conducting interviews.","Naive Bayes and Logistic Regression models to predict success likelihood, and Genetic Algorithm (GA) to optimize the migration process.",Classical ML,"Supervised learning, with models trained on historical project data to predict migration success.","migration challenges, project complexity, team expertise, and resource availability.",success rates and cost reduction.,Developer-Centric,,Surveys and expert feedback were used to validate the predictions.,,,,Data was gathered from interviews with ten experts and processed through predictive models and GA.,,no comparison,,"improvement in success rates from 46% to 99%, with a reduction in costs by 6.1% in Naive Bayes and 15.4% in Logistic Regression ",,Not available,Not available,Specifying the initial conditions for the Genetic Algorithm (GA): Microegrcl: An edge-attention-based graph neural network approach for root cause localization in microservice systems,Monitoring,chen2022_edge_attention_localization,"Chen, R and Ren, J and Wang, L and Pu, Y and Yang, K 
and ...",2022,Conference,International Conference on Service-Oriented Computing.,ICSOC,Compendex,Monitoring,"Automated root cause localization for microservices, using a graph neural network (GNN) to identify the origin of system anomalies within service call graphs.","The GNN model is central to the approach, employed for prediction and classification tasks. The model identifies fault nodes by analyzing node and edge features in the service call graph, using edge-based attention to improve accuracy.","The approach dynamically constructs a service call graph and uses a trained GNN model to automatically infer root causes in real-time, minimizing manual intervention.","The result is a fault localization vector, ranking nodes by their likelihood of being the root cause of the fault.","Data includes service logs and call metrics derived from system performance monitoring, particularly for node and edge features in service interactions.",Runtime Artifacts,"Node-level and edge-level metrics in the form of matrices, representing each node (service) and its inter-service relationships.","Data is collected from synthetic microservice applications (e.g., Sock-shop benchmark) running in a controlled environment and injected anomalies for simulation.","Involves construction of the service call graph using adjacency matrices for node relationships, edge feature matrices for call metrics, and normalization of features before feeding them into the model.",Graph Neural Network (GNN) with an edge-feature-enhanced attention mechanism.,Graph Based ML,"Supervised learning for training the GNN model on labeled anomaly data, enabling classification of fault nodes.",The model uses node metrics (resource utilization and network status) and edge metrics (response time and TCP send count).,"Precision at K (P@K), measuring the probability that the root cause ranks within the top K predictions.",Classification and Prediction,Precision at K (P@K),Qualitative comparisons with existing methods 
to demonstrate improved localization accuracy for various types of anomalies.,improved localization accuracy,"Comparisons with Random Walk, GraphSAGE, and MicroRCA for localization accuracy.","Random Walk, GraphSAGE, and MicroRCA","Experiments were conducted on a Kubernetes cluster with simulated workloads, using Chaos-mesh for fault injection and Prometheus for metric collection.",Kubernetes,"The method was compared against Random Walk, GraphSAGE, and MicroRCA models.","Random Walk, GraphSAGE, and MicroRCA","Success is measured by improved localization accuracy and precision compared to baseline methods, with the proposed method achieving an average P@1 accuracy above 80%.",improved localization accuracy,proof of concept (PoC) developed specifically for this research.,Not available,"As the size of the microservice system increases, maintaining localization accuracy becomes more challenging. Results are based on controlled environments with synthetic applications; performance in real-world settings may vary." Detecting Security and Privacy Risks in Microservices End-to-End Communication Using Neural Networks,Monitoring,chou2021_security_privacy_nn,J. Chou and E. Al-Masri and S. Kanzhelev and H. 
Fattah,2021,Conference,2021 IEEE 4th International Conference on Knowledge Innovation and Invention (ICKII),ICKII,IEEE Xplore,Monitoring,Detection and classification of privacy and security risks in the communication between microservices.,detect and classify risks in the communication paths between microservices.,highly automated,classified risks,"Service communication traces, including Quality of Service (QoS) metrics like response time, availability, reliability",Domain Artifacts,microservice communication path,"QWS Dataset, which provides real-world web service metrics, and tracing data collected using OpenTelemetry.","Involves cleaning, normalization, and computation of risk scores for each node using QoS attributes",Fully Connected Neural Networks (FCNN).,Classical ML,Supervised learning with training data from the QWS dataset.,"QoS attributes such as response time, availability, reliability, and compliance.",Classification rate (CR) of correctly classified service ,Classification and Prediction,Classification rate ,Accuracy of risk classification using different configurations of the neural network.,using different configurations of the neural network.,"Uses the QWS dataset.",Uses the QWS dataset,"Google Colaboratory with an Intel Xeon CPU.",Google Colaboratory with an Intel Xeon CPU.,no comparison,no comparison,Achieves a 96.55% classification rate ,high accuracy,Not available,Not available,"Feature importance: Determining the most useful features from QoS attributes for accurate risk detection. Data variability: Ensuring consistent performance across different service paths and communication traces. Model tuning: Identifying the right combination of hyperparameters (hidden layers, neurons) for optimal performance." 
Enhancing fault localization in microservices systems through span-level using graph convolutional networks,Monitoring,kong2024_fault_localization_span,"Kong, H and Li, T and Ge, J and Zhang, L and Li, L",2024,Journal,Automated Software Engineering,ASE,Compendex,Monitoring,"Automated fault localization at the span level, utilizing graph convolutional networks (GCN) to detect anomalies in microservices by analyzing directed graphs of trace data and inter-service invocation relationships.","GCN is employed for anomaly detection and fault localization within the graph representation of microservices, specifically learning edge representations to identify abnormal invocation relationships.","SpanGraph constructs graphs from trace data and uses a pre-trained GCN model to continuously detect faults without manual intervention, even adapting to incremental data in real-time scenarios.",The primary output is the localization of faulty spans (edges in the graph) and identification of the root-cause service node associated with the fault.,"SpanGraph uses trace logs, monitoring metrics, and configuration files to build directed graphs representing microservice interactions.","Source Artifacts, Runtime Artifacts","Graphs are constructed at the span level for individual invocation requests, including node and edge details.","Datasets are from open-source microservices benchmarks such as SockShop and TrainTicket, with faults injected for evaluation.","The process includes trace parsing, filling missing values, and normalizing features, followed by graph construction where nodes and edges are connected based on temporal relationships.",Graph Convolutional Network (GCN) with edge representation learning.,Graph Based ML,"Supervised learning, training on labeled trace data for edge and node classification.","Node features include execution metrics (e.g., execution time, memory usage) and trace-specific attributes; edge features cover temporal and resource usage metrics.","accuracy, 
precision, recall, and F1-score",Classification and Prediction,"accuracy, precision, recall, and F1-score","Qualitative comparison of SpanGraph’s anomaly detection and fault localization against baseline methods, demonstrating improved accuracy in identifying faults.",comparison against baseline methods,"SpanGraph is compared to multiple baselines (e.g., Random Forest, KNN, MLP, TLCluster), achieving higher F1-scores on both datasets.",Higher F1-score,Experiments are conducted on a Kubernetes-based cluster with 15 VMs; faults are injected in SockShop and TrainTicket microservices for testing.,Kubernetes,"Methods include MEPFL (RF, KNN, MLP variants) and TLCluster, with SpanGraph showing superior performance.","MEPFL (RF, KNN, MLP variants) and TLCluster","Success is defined by higher precision, recall, and F1-score, with SpanGraph achieving over 12% improvement in F1-score for SockShop and 8% for TrainTicket datasets.",Higher F1-score/precision/recall,PoC,Not available,"The study explores a limited range of faults, which may not fully generalize to all real-world conditions." Minimize Resource Cost for Containerized Microservices Under SLO via ML-Enhanced Layered Queueing Network Optimization,Monitoring,luan2024_resource_optimization ,S. Luan and H. Shen,2024,Conference,"2024 14th International Conference on Cloud Computing, Data Science & Engineering (Confluence)",Confluence,IEEE Xplore,Monitoring,Optimizing CPU resource allocation for microservices while meeting Service Level Objectives (SLO).,DNN is integrated into performance prediction models for microservices by enhancing Layered Queueing Network (LQN) optimization.,"High, the system automatically adjusts CPU allocations using DNN-enhanced predictions.",resource allocation plan,"Microservice runtime metrics (CPU, response time, throughput) and system configuration. 
using 800 users workloads","Source Artifacts, Runtime Artifacts",Per microservice instance.,simulated microservice-based application created by authors,"Workload is collected from users, then a matrix is built where rows represent data points (e.g., different time slots or configurations), and columns represent features (CPU, requests, response time).",Deep Neural Networks (DNNs) are used to predict microservice performance ,Deep Learning,Supervised learning,"Metrics like CPU usage, response time, and throughput.","Response time, CPU utilization, accuracy of prediction","Classification and Prediction , System Behavior","Response time, CPU utilization, accuracy of prediction",-,-,"Simulated application with seven nodes, tested under workloads between 300 and 800 users.",-,Windows OS with 3.80 GHz 8-Core AMD 5800X processor and 16GB RAM.,-,Compared with two baseline methods (Equal Number Allocation (ENA) and Graph Neural Network-based GRAF),ENA and GRAF,The proposed method reduced CPU costs by 30% to 50% compared to ENA and GRAF.,Reduced CPU costs,Not available,Not available,The high computational complexity of existing LQN models. The dynamic and fluctuating nature of microservice workloads. Ensuring accurate predictions despite the complex interdependencies between microservices. Scaling across large workloads while maintaining the accuracy of ML predictions. 
Cdascaler: a cost-effective dynamic autoscaling approach for containerized microservices,Deployment,shafi2024_cdascaler_autoscaling,"Shafi, N and Abdullah, M and Iqbal, W and Erradi, A and Bukhari, F",2024,Journal,CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS,Cluster Computing,Compendex,Deployment,resource allocation and autoscaling,predict the required number of CPU cores and containers based on incoming workload.,automated: needs to specify some thresholds,,Performance traces,Runtime Artifacts,container level and microservice level,"WorldCup, Wikipedia, ClarkNet, Calgary, and NASA: real-world workloads",Preprocessing includes removing traces that exceed response time thresholds and tuning for accurate scaling.,"Linear regression, polynomial regression, Decision Trees, Random Forest, and Multilayer Perceptron (MLP).",Classical ML,Supervised learning,"CPU usage, response time, container count, and request rates.","processed requests, response time, cost, and SLA violations.",System Behavior,"requests, RT, cost, and SLA.",-,-,five real-time workloads.,five real-time workloads.,"Conducted on a Kubernetes cluster with three machines (Core-i7, 8-GB RAM, 2-TB hard disk).", Kubernetes cluster with three machines ,"Evaluated against the container autoscaling techniques C-React, C-Proc, and RC-Proc ","C-React, C-Proc, and RC-Proc ",Achieved 40-60% reduction in SLA violations and cost reductions compared to baseline methods.,40-60% reduction in SLA violations and cost reductions,Not available,Not available,"Scalability: Ensuring scalability and performance as workloads vary significantly across services. Handling bursty workloads: Managing highly dynamic, fluctuating workloads in real time without degrading performance." 
Seer: Leveraging Big Data to Navigate the Complexity of Performance Debugging in Cloud Microservices,Monitoring,gan2019_seer_performance_debugging ,Yu Gan and Yanqi Zhang and Kelvin Hu and Dailun Cheng and Yuan He and Meghna Pancholi and Christina Delimitrou,2019,Conference,Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems,ASPLOS,ACM,Monitoring,"detection and avoidance of Quality of Service (QoS) violations in microservice-based cloud systems. It uses deep learning to analyze large-scale tracing data, predict upcoming QoS violations, and suggest corrective actions to maintain performance","deep learning, particularly a hybrid CNN-LSTM model, to recognize spatial and temporal patterns in tracing data. It analyzes the dependencies between microservices and predicts potential QoS violations. Seer also uses per-node hardware monitoring data to diagnose root causes of performance degradation","Seer automates performance monitoring, anomaly prediction, and diagnosis by using deep learning models trained on massive amounts of tracing data. It also proactively signals QoS violations and recommends actions to prevent them","Seer outputs predicted QoS violations, highlighting the specific microservices likely to cause the issue, and offers recommendations for performance adjustment to avoid degradation. 
It generates alerts for system operators before a violation occurs","The input consists of distributed traces collected at the RPC level, along with detailed hardware monitoring data such as CPU, memory, network bandwidth, and cache utilization",Runtime Artifacts,"microservice level and node level, analyzing metrics for each microservice and the entire cluster, as well as identifying bottlenecks and performance issues at the hardware resource level","The data comes from tracing systems (e.g., based on Apache Thrift) in cloud environments, capturing interactions between microservices, as well as hardware performance counters to monitor node-level metrics","queue length estimation, latency aggregation, and trace collection at multiple levels (RPC-level and node-level). Data is also filtered to remove noisy or irrelevant traces before being used to train the neural networks","hybrid CNN-LSTM model. The CNN layers handle spatial dependencies between microservices, while the LSTM layers detect temporal patterns that lead to QoS violations. It employs a SoftMax layer for classifying microservices likely to",Deep Learning,"supervised learning, where it trains on past traces annotated with QoS violations. It learns both spatial and temporal patterns that correlate with performance bottlenecks","queue depths, latency, and resource utilization (CPU, memory, bandwidth, etc.) for each microservice. These features are critical for predicting which microservice is responsible for upcoming performance issues","accuracy in predicting QoS violations, achieving a 93% detection accuracy and 84% avoidance of QoS violations in real-time systems",Classification and Prediction,accuracy in predicting QoS violations,"Seer significantly reduces the time to detect and prevent QoS violations, outperforming traditional utilization-based methods. 
It helps developers identify design flaws in microservices and provides proactive solutions to avoid performance bottlenecks",outperforming traditional utilization-based methods,"benchmarked in both small-scale (20-server) and large-scale (100-server) deployments. In large-scale environments (Google Cloud and Microsoft Azure), Seer demonstrated high accuracy and scalability, reducing inference time by 200x using TPUs and",small (20 servers) and large (100 servers) deployments,local clusters (20 servers) and large-scale public cloud clusters (Google Cloud and Azure). The system was tested with real user traffic in a social network service with 582 registered users and 165 daily active users,1 public cluster (Google Cloud and Azure) and 1 local ,"Seer outperforms utilization-based debugging methods and alternative tracing systems like Dapper and Zipkin, offering higher accuracy in detecting root causes of QoS violations and better scalability",Dapper and Zipkin,"ability to anticipate QoS violations with high accuracy, identify root causes, and recommend corrective actions before violations occur. It achieves 90.6% accuracy in detecting upcoming violations and helps avoid 84% of potential QoS violations","anticipate QoS violations, identify root causes",research prototype,Not available,Handling massive datasets (up to 1TB of trace data) for training requires significant computational resources. 
Seer relies on hardware accelerators like TPUs and FPGAs to ensure scalability as the size of the cloud infrastructure grows Deepstitch: Deep Learning for Cross-Layer Stitching in Microservices,Monitoring,li2021_deepstitch_cross_layer,Richard Li and Min Du and Hyunseok Chang and Sarit Mukherjee and Eric Eide,2021,Conference,Proceedings of the 2020 6th International Workshop on Container Technologies and Container Clouds,WOC,ACM,Monitoring,"cross-layer stitching of tracing information between the application layer (distributed tracing) and the kernel layer (system call tracing) in microservices. It learns how system calls map to application-level operations, enabling more fine-grained performance diagnostics without requiring modifications to tracing tools",Long Short-Term Memory (LSTM) neural networks to learn the patterns of system call sequences across multiple services. It combines these learned patterns with application-level tracing information to stitch together system calls with their corresponding higher-level service interactions,The system fully automates the process of stitching application and kernel-level traces. It dynamically learns the global system call sequence patterns and applies this knowledge without needing manual intervention or modifications to the underlying tracing tools,"mapped sequence that stitches together system call traces with application-level spans. 
This stitched trace provides a detailed view of the system's operation, including thread-level interactions and potential sources of performance degradation","application-level distributed traces (e.g., spans and traces from tools like Jaeger) and kernel-level system call traces, which are combined to create cross-layer insights into system behavior",Runtime Artifacts,Deepstitch operates at both the system call level (capturing detailed kernel operations) and the trace/span level (representing higher-level application interactions),"The system was evaluated using a real-world e-commerce microservices application (Sock Shop), which provided distributed tracing data along with system call traces collected using tools like Jaeger and vltrace",removing idle sequences from the system call traces by training idle models for each service. This helps isolate active system calls that correspond to application-level operations,"Deepstitch employs LSTM networks to learn patterns in system call sequences across services. These learned models are used to match system call traces to application-level spans, thus enabling cross-layer stitching without manual intervention",Deep Learning,"supervised learning during the model training phase, where it learns from system call sequences generated during isolated test runs. This allows Deepstitch to create a model that can identify system calls relevant to specific application operations","system call types, timestamps, and thread information, which are used to distinguish between active and idle system calls. These features help the LSTM model learn patterns in system call behavior across services","The evaluation metrics include stitching accuracy, which measures how well the system correctly associates system calls with application-level spans. 
For example, the system achieved 91.38% accuracy for the ""GET /orders"" trace",Classification and Prediction,stitching accuracy,"Deepstitch demonstrates significant improvements over traditional methods, particularly in microburst troubleshooting and performance tuning. By enabling cross-layer stitching without modifying existing tracing tools, it offers a flexible and non-intrusive solution",improvement in microburst troubleshooting and performance tuning,"The system was tested on the Sock Shop microservices application running on three servers, each equipped with multi-core CPUs, and demonstrated high accuracy in cross-layer stitching across multiple traces",1 system running on 3 servers,"deployed on three Dell PowerEdge R430 servers running a Kubernetes cluster, with tracing data collected using Jaeger for application-level traces and vltrace for system calls. The models were trained and evaluated using NVIDIA GPUs for accelerated learning",3 Dell PowerEdge R430 servers,"Deepstitch outperforms traditional cross-layer stitching methods that rely on manual modifications to tracing tools or kernel libraries. It eliminates the need for custom kernel modules or application-layer modifications, making it more scalable and easier to deploy",outperforms traditional cross-layer stitching methods,"ability to accurately stitch system call traces with application-level spans across multiple services, without requiring any modifications to existing tracing tools or kernel layers",accurately stitch system call traces with application-level spans,poc,Not available,"The LSTM models used by Deepstitch are trained on specific microservice architectures, so retraining may be necessary when applied to different systems or architectures. 
Ensuring the model generalizes well across various microservice patterns is a challenge" Automatic Migration-Enabled Dynamic Resource Management for Containerized Workload,Deployment,khan2023_dynamic_resource_management,"Khan, Saad Ahmad and Abdullah, Muhammad and Iqbal, Waheed and Butt, Muhammad Arif and Bukhari, Faisal and Hassan, Saeed-Ul",2023,Journal,IEEE Systems Journal,Syst. J.,IEEE Xplore,Deployment,container placement and migration to reduce the number of active physical machines.,predicts job execution times to allocate CPU resources dynamically and clusters jobs based on their predicted execution time.,highly automated,resource allocation plan,"containerized workloads: CPU pin/core requirements, completion times, and energy consumption metrics.",Runtime Artifacts,job level,simulations,"Historical data of job execution is collected for training the prediction model, and execution time is predicted for allocating CPU ",deep learning (DNN) for execution time prediction and K-means clustering for job partitioning.,Deep Learning,"Supervised learning is used for job execution time prediction, and unsupervised learning (K-means) is used for clustering jobs.",The primary feature for clustering is estimated job execution time,"energy consumption (kW/h), job completion time, and percentage of free PMs during job execution",System Behavior,"energy consumption (kW/h), job completion time, and percentage of free physical machines",Energy efficiency and performance,Energy efficiency and performance,two different data center setups (small and large scale).,two different data center setups (small and large scale).,based on Docker Swarm,based on Docker Swarm,Sweep and WoM,Sweep and WoM,lower energy consumption (up to 2.35× improvement) and reduced migration counts,lower energy consumption (up to 2.35× improvement) and reduced migration counts,Not available,Not available,"extending the system to handle containers requiring resources beyond CPU (e.g., memory, 
bandwidth) and integrating more complex ML methods for other types of workload" BSDG: Anomaly Detection of Microservice Trace Based on Dual Graph Convolutional Neural Network,Monitoring,shi2022_bsdg_anomaly_detection,"Shi, KZ and Li, J and Liu, YC and Chang, YZ and Li, XY",2022,Conference,SERVICE-ORIENTED COMPUTING (ICSOC 2022),ICSOC,Compendex,Monitoring,The system monitors microservices and detects anomalies in their behavior.,detecting anomalies in microservice traces,Highly automated,detected anomalies,microservice traces: the communication and operation states of microservices in a distributed environment.,Runtime Artifacts,traces: microservice call level,"collected by simulating fault injection, real production data from a large service provider and constructed by the authors","cleaning, normalization, and converting traces into graph representations",Dual Graph Convolutional Networks (DGCNs).,Graph Based ML,"Supervised learning, where traces are labeled as normal or anomalous.","communication patterns, service call latency, and error rates between services.","Precision, Recall, and F1-Score",Classification and Prediction,"Precision, Recall, and F1-Score",/,/,"The datasets (TTFI, AIOps2020, and MPFI) were used for benchmarking.",same dataset,The experiments were conducted on these datasets,Dataset,Autoencoders and threshold-based anomaly detection techniques.,Autoencoders and threshold-based anomaly detection techniques.,The model outperforms baselines in terms of precision and recall in detecting microservice anomalies.,Higher precision and recall,Not available,Not available,Incomplete or noisy trace data can affect the accuracy of the model. 
Unsupervised learning approach for web application auto-decomposition into microservices,Identification,abdullah2019_unsupervised_web_decomposition ,"Abdullah, Muhammad and Iqbal, Waheed and Erradi, Abdelkarim",2019,Journal,Journal of Systems and Software,JSS,ScienceDirect,Identification,identification of microservices,group URIs into clusters based on their resource consumption,highly automated,microservices clusters,"Access logs from the web application (which include document size, response time, and URI information).",Runtime Artifacts,URI level,"real-world web, historical access logs collected from the monolithic web application.","numeric feature-based approach to extract relevant attributes such as URI, document size, and response time",Scale Weighted K-Means (SWK) for clustering URIs based on resource consumption: with a specific weight matrix used to prioritize response time over document size when determining cluster centers.,Classical ML,unsupervised,The key features are document size and response time for each URI in the application logs.,"response time, number of processed requests, SLO violations, and cost.",System Behavior,"response time, number of processed requests, SLO violations, and cost.",compare auto-created microservices and manually created microservices to assess the effectiveness of the automated decomposition.,compare with manually created microservices ,manually created microservices and a baseline monolithic implementation.,manually created microservices and monolithic implementation.,"The experiments were conducted using AWS, deploying both the monolithic and microservices implementation",using AWS,compared to manually created microservices and the baseline monolithic version of the application., manually created microservices and the monolithic version ,"improved performance, scalability, and cost reduction","improved performance, scalability, and cost reduction",Not available,Not available,"Identifying Optimal Number of Clusters 
(k), Overlapping Clusters: Some URIs may fall into multiple clusters, " A new decomposition method for designing microservices,Identification,al_debagy2019_decomposition_method,"Al-Debagy, Omar and Martinek, Peter",2019,Journal,Periodica polytechnica Electrical engineering and computer science,Periodica polytechnica,Scopus,Identification,identification of microservices,Semantic analysis: generate embeddings,semi-automated: manual fine-tuning for final service definition,cluster: identified microservice candidates,"OpenAPI specifications, describe REST APIs, including endpoints and operation names",Domain Artifacts,"API operation level, focusing on individual operation names.",research-focused system,extracting operation names from the OpenAPI specifications and converting them into word embeddings using fastText.,word embedding techniques: fastText,Classical ML,unsupervised,operation names,"precision, recall, and F-Measure.",Classification and Prediction,"precision, recall, and F-Measure.",The system is evaluated qualitatively based on its ability to group semantically similar operations together into microservice candidates.,The ability to group semantically similar operations together ,-,-,The experimental setup is based on the OpenAPI specifications provided by the system's developers., OpenAPI specifications provided by the system's developers.,No comparison,No comparison,"ability to correctly identify and cluster operation names into meaningful microservice candidates, with a high precision and recall.",a high precision and recall.,Not available,Not available,The semantic similarity approach is limited by the quality and comprehensiveness of the OpenAPI specifications. 
Dependencies-based microservices decomposition method,Identification,al_debagy2021_dependencies_based_decomposition,"Al-Debagy, O and Martinek, P",2021,Journal,International Journal of Computers and Applications,IJCA,Scopus,Identification,decomposition of monolithic applications into microservices by creating a class dependency graph and using clustering algorithms to identify microservice candidates.,"No direct machine learning models are used. Instead, the approach focuses on graph clustering algorithms (like Leiden, Louvain) to detect microservice candidates by clustering related classes based on their dependencies.",The entire process from constructing the class dependency graph to clustering and generating microservices is automated.,"The output is a set of microservice candidates, identified by clustering classes with strong dependencies into potential microservices.","source code from the monolithic application, used to build a class dependency graph.",Source Artifacts,"The granularity is at the class level, where dependencies between classes are analyzed and used to form a graph.","The data comes from eight open-source Java applications used as test cases in the study, such as AcmeAir, SpringBlog, and FTGO.",Preprocessing includes extracting class dependencies from the monolithic source code and representing them as a weighted graph.,"No traditional ML models are used. 
The focus is on graph clustering techniques, such as Leiden and Louvain, to form microservice clusters.",Classical ML,"unsupervised learning through clustering, grouping classes into microservices based on their relationships.","dependency between classes, which is used to construct the edges of the graph.",F1 score and Newman Girvan Modularity (NGM) to assess the accuracy and quality of the decomposition.,Clustering,"NGM, F1-score","The method is compared against other approaches, showing promising results in terms of precision and structural organization.",3 other literature approaches,"The method is benchmarked against 11 different clustering algorithms, such as Leiden, Louvain, and others, with Leiden performing best in terms of accuracy.",11 different clustering algorithms,"The approach is tested on eight Java applications, with comparisons across various clustering algorithms.",8 Java apps,"The method is compared against existing decomposition methods like Nunes et al., Baresi et al., and Selmadji et al., showing better or comparable results in F1 score.",3 literature approaches,higher F1 scores (accuracy in detecting microservices) and higher NGM scores (structural quality of the microservices).,"higher F1 score, higher NGM",poc,Not available,"The method was tested only on small to medium applications, so performance on large applications remains untested. Some classes with fewer references may not be properly clustered, potentially missing some microservice candidates." 
A microservice decomposition method through using distributed representation of source code,Identification,al2021microservice,"Al-Debagy, O and Martinek, P",2021,Journal,Scalable Computing: Practice and Experience,ICI,Compendex,Identification,decomposition of monolithic applications into microservices by using code embeddings (generated via code2vec) to cluster semantically similar classes into microservices candidates.,"code2vec neural network model to generate distributed code embeddings from the monolithic application's source code. These embeddings capture semantic similarities between code snippets, enabling clustering of classes into microservice candidates.","The method automates the process of extracting methods, generating code embeddings, and clustering classes based on these embeddings to suggest microservices candidates. The whole process minimizes the need for manual intervention.","clusters of classes, which represent microservices candidates. These are evaluated based on cohesion at both the message and domain levels.","source code from the monolithic application. Methods are extracted from classes to generate code embeddings, which are then used for clustering.",Source Artifacts,"class and method levels, extracting methods from classes and converting these methods into vector representations (embeddings).","four benchmark applications, including JPetStore, SpringBlog, JForum, and Apache Roller, which are open-source monolithic Java applications.",extracting methods from classes and converting them into embeddings using code2vec. The embeddings are then aggregated to represent entire classes for clustering.,"code2vec to generate embeddings, and Affinity Propagation is applied as the clustering algorithm to group similar classes into microservice candidates.",Classical ML,unsupervised learning through clustering based on distributed code embeddings. 
No labeled data is required.,"code embeddings generated from the methods in the source code, which capture semantic similarities.",Cohesion at Message Level (CHM) and Cohesion at Domain Level (CHD) to assess how well the method groups related classes into cohesive microservices.,Software Design,"CHD, CHM","The method is compared against other decomposition methods (e.g., Jin et al., Saidani et al.) and shows superior results in terms of cohesion for most test cases.",compared with 2 literature approaches (Jin et al. and Saidani et al.),"higher cohesion metrics (CHM and CHD) in most cases, particularly for larger applications like JForum and Apache Roller.",2 literature approaches,"The experiments were conducted using four Java applications: JPetStore, SpringBlog, JForum, and Apache Roller, with detailed metrics for comparison.",4 open-source Java apps,"compared to methods by Jin et al. and Saidani et al., showing better performance in cohesion metrics for most test cases.",2 literature approaches ,"method's ability to produce microservices with high cohesion, as indicated by better CHM and CHD scores compared to other methods.",High cohesion ,poc,Not available,The method's performance on larger applications requires further testing. Partial migration for re-architecting a cloud native monolithic application into microservices and FaaS,Identification,bajaj2020_partial_migration_cloud_native ,"Bajaj D, Bharti U, Goel A, Gupta SC",2020,Conference,"In: International conference on information, communication and computing technology, pp 111--124",ICICCT,Compendex,Identification,partial migration of monolithic applications by identifying components that can be migrated to microservices or Function-as-a-Service (FaaS) using web access log data. 
It analyzes usage patterns and resource requirements to determine suitable migration paths for different services.,"K-means clustering (an unsupervised learning algorithm) to group Uniform Resource Identifiers (URIs) based on their frequency of use and resource consumption, helping identify which parts of the monolithic application should be migrated to microservices or FaaS.","The entire process of log preprocessing, URI clustering, and service model mapping is automated, providing recommendations for whether a service should remain in the monolith, be migrated to microservices, or be moved to a FaaS platform.","set of recommendations for migrating services to either microservices or FaaS, along with a visual clustering of URIs based on their scalability and resource requirements.","web access logs from the monolithic application, including parameters like request response time (RRT) and URI frequencies.",Runtime Artifacts,"URI level, where each URI represents a specific resource or function within the monolithic application.","The data used for evaluation is from a Teachers Feedback Web Application, a monolithic application developed for a case study in this research.","extracting relevant fields from the web access logs, such as URI and RRT, and standardizing the data before applying the clustering algorithm.","K-means clustering, an unsupervised learning technique used to partition URIs into clusters based on their frequency and resource usage.",Classical ML,"unsupervised, clustering the URIs based on usage patterns and resource consumption without prior labels.","URI frequency (indicating scalability needs) and Mean Request Response Time (MRRT), which reflects resource usage (e.g., CPU, disk, network).","URI frequency and MRRT to assess resource consumption and scalability needs for appropriate service model mapping (monolith, microservices, FaaS).",System Behavior,"MRRT, URI Frequency","The results are assessed qualitatively by demonstrating how different 
components of the monolithic Teachers Feedback Web Application are mapped to the three service models (monolith, microservices, FaaS).",how different components of the case study are mapped to the three service models,"The approach was validated against a real-world case study (Teachers Feedback Web Application), showing its effectiveness in identifying suitable migration paths for different components based on their usage patterns.",1 real-world case study,"The experiment was conducted on the Teachers Feedback Web Application, where logs from a complete usage cycle were collected and analyzed using K-means clustering.",logs from a complete usage cycle were collected and analyzed,The method is compared to traditional migration approaches that focus on complete migration to microservices or FaaS. This partial migration approach aims to optimize resource utilization by keeping some services in the monolith.,compared to traditional approach,"identifying services that can be migrated to microservices or FaaS, leading to optimized resource utilization and improved scalability without a full migration.",optimized resource utilization and improved scalability,poc,Not available,The success of the migration depends on accurate web access logs that reflect actual usage patterns. Dynamic and static feature-aware microservices decomposition via graph neural networks,Identification,chen2023_dynamic_static_features,"Chen, L and Guang, M and Wang, J and Yan, C",2023,Conference,"International Conference on Knowledge Science, Engineering and Management",KSEM,Compendex,Identification,decomposition of monolithic applications into microservices by leveraging both dynamic and static features of the system. 
It combines method call relations (dynamic) and class dependencies (static) to construct a graph representing the monolith for clustering into microservices.,"The method uses Variational Graph Autoencoders (VGAE) to learn the graph's structure and the nodes' features, improving clustering by embedding classes with similar functionalities into a latent space.",The entire process from extracting static and dynamic features to constructing a graph and clustering the classes into microservices is automated using graph-based deep learning techniques. The method eliminates the need for manual feature selection or graph construction.,"microservice candidates represented by clusters of classes from the monolithic system, which are grouped based on their static and dynamic relationships.","source code for static analysis and execution traces for dynamic analysis. The source code is used to determine class dependencies, while traces provide insights into method interactions.",Source Artifacts,class and method levels. It analyzes dependencies between classes (static features) and the interactions between methods during execution (dynamic features).,"four benchmark applications, including JPetStore, SpringBlog, JForum, and Apache Roller, which are open-source monolithic Java applications.","parsing execution traces to extract method call relations and analyzing class dependencies in the source code to build a graph. Additionally, Code2Vec is used to generate semantic vectors of code.","Variational Graph Autoencoders (VGAE) to learn embeddings for each class (node) in the graph, based on both dynamic and static features.",Graph Based ML,unsupervised learning to cluster classes based on their graph representations. No labeled data is required.,dynamic features (method call relations) and static features (class dependencies and semantic vectors of methods). 
These features are combined to construct a comprehensive graph representation of the monolith.,"uses cohesion and coupling metrics, specifically: CHM, CHD, OPN, IRN",Software Design,"CHM, CHD, OPN, IRN","The method is compared to other approaches (e.g., LIMBO, MEM, FoME, MSExtractor) and shows superior performance in terms of cohesion and reduced coupling. It achieves better clustering results by combining dynamic and static features.",compared with 4 approaches and outperformed,"benchmarked against LIMBO, MEM, FoME, and MSExtractor, with MDDS showing better results in cohesion (CHM and CHD) and comparable or improved results in coupling (OPN and IRN).",better results in cohesion (CHM and CHD) and comparable or improved results in coupling (OPN and IRN),"four Java-based applications: JPetStore, SpringBlog, JForum, and Roller. The dynamic analysis used Kieker for collecting execution traces, and static analysis relied on Code2Vec for generating code embeddings.",4 OS Java app,"compared against LIMBO, MEM, FoME, and MSExtractor, which use various static and dynamic approaches for microservice decomposition. MDDS outperforms these methods in several key metrics, particularly for large applications like Roller.",4 literature approaches,"measured by the model’s ability to increase cohesion within the microservices and reduce coupling between services, which leads to better-structured and more efficient microservices.",increase cohesion and reduce coupling,poc,Not available,"Dynamic analysis might not fully cover all classes in the system, requiring manual intervention in cases where certain parts of the system lack sufficient execution trace data." Towards an Automatic Identification of Microservices from Business Processes,Identification,daoud2020_microservice_identification_business ,M. Daoud and A. El Mezouari and N. Faci and D. Benslimane and Z. Maamar and A. 
El Fazziki,2020,Conference,2020 IEEE 29th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE),WETICE,IEEE Xplore,Identification,identification of microservices,Collaborative clustering techniques are used to group activities based on dependency types,High automation,Microservices,Business process models represented in BPMN (Business Process Model and Notation) format.,Model Artifacts,Activity-level granularity,"real world, industrial data","Dependencies (control, semantic, and data) between activities are extracted, and dependency matrices are constructed",Collaborative Hierarchical Agglomerative Clustering (cHAC).,Classical ML,unsupervised,"Features include control execution order, data sharing between activities, and semantic similarity of activity names.",Dunn Index to measure clustering quality,Clustering,Dunn Index ,Dunn,Dunn,Bicing bike rental system with an increasing number of activities.,Bicing bike rental system ,"Implemented in Java, tested with real business processes from the Bicing system.",real world data,Collaborative clustering is compared to centralized clustering,centralized clustering,Higher Dunn Index values indicating better clustering results.,Higher Dunn Index,Not available,Not available,Performance: Ensuring the algorithm scales efficiently with an increasing number of business activities. Enabling Practical Cloud Performance Debugging with Unsupervised Learning,Monitoring,gan2022_practical_cloud_performance,"Gan, Yu and Liang, Mingyu and Dev, Sundar and Lo, David and Delimitrou, Christina",2022,Conference,Operating Systems Review (ACM),OSR,ACM,Monitoring,automates root cause analysis of performance issues in cloud microservices. 
It identifies performance degradation and automatically adjusts the resource allocation of microservices to prevent further Quality of Service (QoS) violations,"Causal Bayesian Networks (CBN) to capture dependencies between microservices, and Graphical Variational Autoencoders (GVAE) to generate counterfactual scenarios that assess the impact of specific microservices on end-to-end performance","Sage automates performance monitoring, root cause analysis, and resource adjustments across cloud microservices. It functions without needing detailed instrumentation or invasive monitoring, making it highly scalable and practical for production environments","root cause identification of QoS violations and dynamically adjusts resource allocations (e.g., CPU frequency, cache partitioning) to improve system performance and prevent future violations","RPC-level distributed traces, container-level metrics, and hardware performance metrics (e.g., from tools like Jaeger, Prometheus, and cAdvisor)",Runtime Artifacts,microservice level (monitoring services across tiers) and resource level (evaluating container and hardware performance),"Data is collected from live traffic traces using distributed tracing systems such as Jaeger, Prometheus, and cAdvisor, without any need for application or kernel modifications","collected and pre-processed from distributed traces and performance metrics. Metrics are normalized, and low-frequency traces are aggregated to reduce overhead and ensure high accuracy during runtime",Sage uses Causal Bayesian Networks (CBN) to map dependencies and Graphical Variational Autoencoders (GVAE) to predict the impact of changing microservice performance on the overall system,Graph Based ML,"unsupervised learning, eliminating the need for labeled data. The system trains on live, uninstrumented traffic and low-frequency traces","latency propagation, container metrics, resource usage, and hardware-level performance indicators. 
These are used to identify performance bottlenecks and root causes",evaluated on root cause identification accuracy and QoS violation reduction. It correctly identifies root causes in over 93% of cases and improves system performance predictability,"System Behavior, Classification and Prediction",root cause identification accuracy and QoS violation reduction,Sage is more practical than traditional solutions due to its ability to function with low-frequency traces and without invasive instrumentation. It significantly reduces false positives and false negatives compared to other approaches,more practical than traditional solutions,benchmarked against traditional autoscalers and systems like Seer. It achieved higher accuracy and better resource efficiency across multiple cloud environments,3 literature approaches,"tested in both local clusters and large-scale clusters on GCP. It was evaluated with multiple microservice applications, including social networks and media services",Local and large scale cluster (GCP),"Sage was compared with methods like Seer, CauseInfer, and Microscope. 
It outperformed them by using unsupervised learning and avoiding heavy instrumentation and data labeling",outperformed other approaches,ability to accurately detect root causes of performance issues and improve QoS predictability without high overhead or resource costs,accurately detect root causes of performance issues and improve QoS predictability,poc,Not available,ensuring the generalization of Causal Bayesian Networks and Graphical Variational Autoencoders across diverse microservice architectures without requiring extensive retraining or tuning Twin Graph-Based Anomaly Detection via Attentive Multi-Modal Learning for Microservice System,Monitoring,huang2023_twin_graph_detection ,"Huang, Jun and Yang, Yang and Yu, Hang and Li, Jianguo and Zheng, Xiao",2023,Conference,IEEE/ACM International Conference on Automated Software Engineering (ASE),ASE,Compendex,Monitoring,detection of anomalies in microservice interactions,detect anomalies in the system.,highly automated: analyze graph structures and detect anomalous interactions automatically.,detected anomalies in the microservice system,"performance metrics (e.g., response time, resource usage) collected from the microservices",Runtime Artifacts,service level (representing interactions between services) and the instance level (representing individual microservice instances).,real-world microservice systems,Data Filtering to remove noise and irrelevant information and Parsing them to extract service-to-service interactions.,Graph Neural Networks (GNNs) for analyzing the graphs and an attention mechanism for focusing on important interactions. 
,Graph Based ML,unsupervised: multi-modal learning,"graph-based features (e.g., node centrality, edge weight) and time-series performance metrics (e.g., CPU usage, memory consumption, response time).","precision, recall, and F1-score for anomaly detection in microservice systems",Classification and Prediction,"precision, recall, and F1-score ",The anomalies detected by the system are qualitatively compared to known or manually labeled anomalies,compared to known or manually labeled anomalies,"Compared to several baseline anomaly detection methods, both graph-based and log-based.",Compared to several baseline methods,large-scale microservice systems,large-scale microservice systems,"SCWarn, PLELog, HADES, MSTGAD, TraceAnomaly, TranAD, USAD ","SCWarn, PLELog, HADES, MSTGAD, TraceAnomaly, TranAD, USAD ",detect anomalies with higher precision and better recall compared to baseline methods.,higher precision and better recall,Not available,Not available,Building both service-level and instance-level graphs and maintaining the connections between them in a dynamic microservice environment can be challenging. The system must scale to handle large-scale microservice systems with many instances and interactions. 
SCORE: A Resource-Efficient Microservice Orchestration Model Based on Spectral Clustering in Edge Computing,Deployment,li2022_score_resource_orchestration ,"Li, N and Tan, YS and Wang, XC and Li, B and Luo, J",2022,Conference,SERVICE-ORIENTED COMPUTING (ICSOC 2022),ICSOC,Compendex,Deployment,microservice orchestration,group microservices based on resource dependencies and communication patterns ,highly automated,resource allocation plan,"real-time data from microservices: CPU, memory, bandwidth",Runtime Artifacts,Microservice call-level,Real-time edge computing workloads simulated using CloudSim.,Data is collected from running workloads and transformed into matrices representing resource consumption and communication dependencies between microservices.,Spectral Clustering,Classical ML,Unsupervised,"Features include CPU, memory, and bandwidth usage, as well as the frequency and intensity of communication between microservices.","Latency, resource utilization (CPU, memory), and SLA violation rate.",System Behavior,"Latency, resource utilization (CPU, memory), and SLA violation rate.",assessing improvements in latency and resource efficiency.,assessing improvements in latency and resource efficiency.,"Simulated environment with up to 100 edge servers and 1,000 microservices.","100 edge servers and 1,000 microservices.","Experiments are conducted in a simulated edge-cloud environment using the CloudSim toolkit with 100 edge servers and 1,000 microservices.",CloudSim toolkit ,Comparison of microservice orchestration with and without spectral clustering,with and without spectral clustering,Reduced latency and better resource utilization compared to baseline methods.,Reduced latency and better resource utilization,Not available,Not available,"Cluster formation: Ensuring that the spectral clustering groups microservices effectively, balancing resource usage and minimizing communication overhead. 
Heterogeneous environment: Managing resource variations across different edge servers." Unsupervised Microservice Log Anomaly Detection Method Based on Graph Neural Network,Monitoring,liang2024_unsupervised_log_anomaly ,"Liang, X and Li, L and Peng, H",2024,Conference,International Conference on Swarm Intelligence,ICSI,Compendex,Monitoring,"microservice log anomaly detection by parsing logs, constructing log event graphs, and using graph-based deep learning techniques to identify anomalies in microservice systems.","Graph Isomorphism Networks (GIN) to learn representations of log event graphs. This graph-based representation enables the model to capture structural, semantic, and temporal patterns within the logs for anomaly detection. Additionally, the DeepSVDD model is used for anomaly detection by learning hypersphere boundaries that enclose normal data, allowing deviations to be detected as anomalies.","The entire process is automated, from log parsing and graph construction to anomaly detection. The method uses unsupervised learning, eliminating the need for labeled data, and applies graph-based learning to detect anomalies based on log event patterns.","anomaly detection decision for log sequences, identifying which logs exhibit abnormal patterns. The method also provides a graph representation of log events, which can be visualized to interpret system behavior.","microservice logs containing event records, including TraceId, SpanId, and ParentSpanId, which are used to construct the log event graph.",Runtime Artifacts,"log event level, with each log event being a node in the graph. Dependencies between events (e.g., invocation relationships) are used to form edges between nodes.","The data used in the experiments comes from the TrainTicket microservice system dataset, which includes logs generated from microservices. 
The dataset contains 132,485 span events and 7,705,050 log events, with 23,334 anomaly log events","The logs are parsed using an extended version of the Drain log parsing algorithm, which segments and processes the logs to extract key features, such as TraceId and SpanId, and groups logs from the same request. These logs are then transformed into a log event graph, with features extracted using word embeddings and the TF-IDF algorithm.","Graph Isomorphism Networks (GIN) to learn the structure and relationships within the log event graph. GIN is particularly effective for capturing topological information in graph-based data, making it suitable for representing microservice logs. The DeepSVDD model is employed to identify anomalies by learning compact hypersphere boundaries that enclose normal log sequences.",Graph Based ML,The method uses unsupervised learning. It learns from normal log sequences and identifies anomalies without the need for labeled anomaly data. The model applies the DeepSVDD framework to distinguish between normal and abnormal data based on learned patterns.,The model extracts both semantic features (via word embeddings and TF-IDF) and structural features (based on the log event graph topology). These features are used to learn node embeddings that capture the dependencies and patterns in the logs.,"Precision, Recall, and F1-score. The method shows superior performance, with a Precision of 98.6%, Recall of 97.1%, and an F1-score of 97.8%, outperforming baseline methods such as DeepLog, LogAnomaly, and DeepTraLog",Classification and Prediction,"precision, recall, F1-score","The method is qualitatively compared with existing approaches, demonstrating superior performance in detecting anomalies. 
The use of graph neural networks allows the model to capture complex dependencies and interleaved log events that are challenging for traditional methods.",superior performance in detecting anomalies,"The method was benchmarked against DeepLog, LogAnomaly, and DeepTraLog, showing significant improvements in both precision and recall, especially when dealing with complex log data from microservices","benchmarked against DeepLog, LogAnomaly, and DeepTraLog,","conducted using the TrainTicket microservice system dataset, which provides logs of service invocation and events. The dataset includes over 132,000 span events and more than 7.7 million log events, making it suitable for evaluating the scalability and accuracy of the model.",1 big dataset ,"compared to DeepLog, LogAnomaly, and DeepTraLog. It outperformed these methods due to its ability to capture the intricate dependencies between log events and its use of graph-based learning for anomaly detection.",3 literature approaches,Success is defined by the model’s ability to detect anomalies accurately with high precision and recall. The method demonstrates improvements over existing approaches by achieving better F1-scores and handling complex log data in microservice environments.,detect anomalies accurately with high precision and recall.,poc,Not available,"While the model performs well on the TrainTicket dataset, its scalability to much larger systems with more complex logs remains untested." Method of Microservices Division for Complex Business Management System Based on Dual Clustering,Identification,liu2020_dual_clustering_microservices,"Liu, B and Lu, J and Zhang, F and Zhang, W and ...",2020,Conference,"5th International Conference on Mechanical, Control and Computer Engineering (ICMCCE).",ICMCCE,IEEE Xplore,Identification,division of complex business management systems into microservices by analyzing tasks and using dual clustering methods. 
It aims to divide monolithic systems based on functional and data dependencies between tasks.,dual clustering algorithms for task division. Clustering is used to group tasks into potential microservices based on the relationships between data objects and operations within the system.,"The method automates the task decomposition and clustering process, minimizing the need for manual intervention in identifying microservice candidates from the complex monolithic system.","clusters of tasks, each representing a potential microservice. The tasks are grouped based on data object dependencies and operational relevance.",business tasks and data structures from the monolithic system. Each task is modeled and then vectorized for clustering based on its functional and data dependencies.,Model Artifacts,task level. Tasks are the fundamental units of the system that are analyzed for functional similarities and dependencies on shared data objects.,"The document mentions the application of the method in a real-world case study related to the shipbuilding collaborative development process, which provides task data for the clustering process.",task decomposition and modeling to extract relevant features from the tasks and data objects. The data is then vectorized for the clustering algorithms to process.,The dual clustering method uses two clustering algorithms—one based on data objects and the other based on operations. These algorithms work together to identify task groupings that represent microservice candidates.,Classical ML,"The method uses unsupervised learning, as the clustering of tasks into microservices is performed without labeled data. 
The clustering is based on task features extracted from the system.",data dependencies (based on shared data objects between tasks) and operational dependencies (based on the functional relationships between tasks).,"The document does not specify detailed quantitative metrics, but success is qualitatively measured by how effectively tasks are grouped into microservices with minimal coupling and high cohesion.",Clustering,/,The results are qualitatively compared based on how well the method identifies microservices that align with the functional and operational requirements of the system. The method shows improvements in microservice identification for complex systems like shipbuilding collaboration platforms.,MS that align with functional /operational requirements,"The method was tested on a real-world case study from the shipbuilding industry, which demonstrated the method's effectiveness in dividing complex tasks into microservices.",real-world system,"The system was applied to a ship collaborative development process, where tasks were analyzed and divided into microservices using dual clustering.",1 big real-world system analyzed,/,/,"Success is defined by the method's ability to identify microservices that are functionally cohesive and loosely coupled, reducing inter-service dependencies and aligning with the system’s operational requirements.",high cohesion and loosely coupled MS,poc,Not available, scalability for very large systems with thousands of tasks and complex data relationships remains untested. Migrating Monolith System to Microservices with Directed Graph Attention Neural Network,Identification,liu2024_migration_graph_nn ,"Liu, Jianwei and Zhang, Cheng",2024,Conference,Third International Conference on High Performance Computing and Communication Engineering (HPCCE 2023).,HPCCE,IEEE Xplore,Identification,decomposition of monolithic systems into microservices by leveraging directed graph neural networks. 
The DGANN model captures dependencies between classes and transforms the monolithic system into cohesive microservice candidates based on a clustering task​,"Directed Graph Attention Neural Network (DGANN), which uses an attention mechanism to prioritize certain class dependencies during the decomposition process. The DGANN model is designed to learn class dependencies and directionality in function calls, capturing both structural and functional relationships​","The system automates the process of class dependency extraction, graph creation, and clustering, reducing the manual effort required for migrating from a monolithic to a microservice architecture. The attention mechanism automates the learning of relationships between classes, making the process efficient​","microservice candidates, represented as clusters of classes that are functionally cohesive and loosely coupled. The final output is a set of service partitions, ready to be deployed as independent microservices​",class dependency graphs generated from the monolithic system. These graphs are built from dynamic execution trajectory data and test cases that describe the relationships between classes. Node attributes are also included to describe class-specific properties​,Source Artifacts,"class level, where each class is treated as a node in the graph, and method invocations between classes form the edges. The granularity ensures that the model captures detailed interactions between individual components​","dynamic execution trajectory data collected during system runtime via test cases. The method was validated using four open-source datasets: AcmeAir, Daytrader, Plants, and JPetStore","extracting class dependencies and creating the node-attribute matrix, which combines interface attributes, functional attributes, and functional co-occurrence attributes. 
These matrices are normalized before feeding into the graph network"," DGANN model uses a Graph Attention Network (GAT) to learn the importance of different nodes in the graph, applying attention weights to class dependencies based on their significance. The network captures both the structural relationships between classes and the directionality of their interactions",Graph Based ML,"unsupervised learning to perform node clustering, which is essential for the decomposition process. The model minimizes both structural and node attribute reconstruction losses to ensure that clusters correspond to cohesive microservices","interface attributes, functional test unit attributes, and functional co-occurrence attributes. These features capture the relationships between classes and their roles in business logic, guiding the clustering process","BCS (Functional Cohesion), ICP (Cross-partition Coupling), and SM (Cohesion-Coupling). The DGANN model outperforms other methods, achieving lower BCS and ICP scores, and higher SM scores on multiple datasets",Software Design,"BCS (Functional Cohesion), ICP (Cross-partition Coupling), and SM (Cohesion-Coupling)",improvements in partitioning monolithic systems compared to existing methods like Mono2Micro and Deeply. The use of directed attention in capturing call dependencies results in more cohesive and decoupled microservice candidates,improvement in identifying MS compared to 3 literature approaches,"benchmarked against methods such as Mono2Micro, Bunch, Co-GCN, and Deeply. DGANN achieved the best performance on key metrics like BCS and ICP across all four datasets",4 literature methods,"conducted on four open-source datasets: AcmeAir, Daytrader, Plants, and JPetStore. 
The system was evaluated on its ability to cluster classes into microservices using dynamic execution traces and test cases","4 OS projects (AcmeAir, Daytrader, Plants, and JPetStore)","compared to several recent decomposition methods, including Mono2Micro, Bunch, Co-GCN, and Deeply. It outperformed these methods on most metrics, demonstrating superior ability to capture class dependencies and reduce coupling",outperformed these methods on most metrics,ability to decompose monolithic systems into microservices with high functional cohesion and low inter-service coupling. DGANN consistently achieves better clustering results than existing methods across various datasets,high functional cohesion and low inter-service coupling,poc,Not available,"Since DGANN relies on execution traces, the quality of the trace data directly impacts the model's performance. Incomplete or inaccurate traces may result in poor clustering and incorrect microservice candidates" Visualization tool for designing microservices with the monolith-first approach.,Pre-migration,nakazawa2018_visualization_tool ,,2018,Conference, {IEEE} Working Conference on Software Visualization ({VISSOFT}),VISSOFT,IEEE Xplore,Pre-migration,"design of microservices by visualizing and clustering components of a monolithic application into microservices. It constructs a calling-context tree and performs clustering based on source code analysis and function calls, enabling developers to refine the design interactively.",clustering algorithms (semantic-based and calling-context-based) are used to initially group components into microservices.,"The tool automates the initial microservice design and visualization, providing developers with a starting point. 
Developers can manually refine the design through the visual interface based on suggestions, communication frequency, and source code information.","The tool produces a visual representation of microservice designs, showing how components (classes) are clustered and their communication levels, allowing for further refinement and optimization.","source code files from the monolithic application, along with profile data collected during a dry run of the application to generate the calling-context tree.","Source Artifacts, Runtime Artifacts","The tool operates at the class level, analyzing dependencies between classes and function calls.","from open-source web applications such as Acme Air and DayTrader, used as benchmarks for the case studies.","filtering library calls, creating a compacted calling-context tree, and tokenizing source code features for clustering.",k-means++ and CCT-based clustering are applied to group classes based on semantic similarity and function calls.,Classical ML,unsupervised clustering without labeled data to organize the monolith into microservices.,text content of source code (semantic clustering) and function call frequencies (CCT-based clustering).,amount of communication between microservices and the similarity to official microservice designs.,System Behavior,"communication, similarity",A qualitative comparison with the official microservice designs of Acme Air and DayTrader shows that the tool produces reasonable and effective microservice designs that can be refined further by developers.,comparison with 2 MS design,"compared against Mazlami's MST clustering and other methods, showing superior clustering in terms of performance (less communication between services).",compared with Mazlami's MST clustering,"two open-source web applications (Acme Air and DayTrader), profiling function calls with JPROF and testing various clustering methods.",2 OS web app,compared against Mazlami's MST-based clustering and a variation of the MST 
clustering that incorporates function call amounts.,compared with Mazlami's MST clustering,"tool's ability to reduce the amount of communication between microservices while maintaining appropriate service granularity, as well as its similarity to official microservice designs.",reduce the amount of communication,poc,Not available,"Initial clustering results require manual refinement, and the tool's recommendation system helps but does not fully automate this process. The tool's performance in handling larger and more complex monoliths remains to be tested in future work." Towards Migrating Legacy Software Systems to Microservice-based Architectures: a Data-Centric Process for Microservice Identification,Identification,romani2022_data_centric_identification ,Y. Romani and O. Tibermacine and C. Tibermacine,2022,Workshop,2022 IEEE 19th International Conference on Software Architecture Companion (ICSA-C),ICSA-C,IEEE Xplore,Identification,Identifying potential microservices,Clustering techniques like K-means and topic modeling are applied to group database entities (tables) into microservice candidates,"Semi-automatic; clustering helps identify microservice candidates, but developer intervention is required for validation and naming",Microservices,"Entity-Relationship diagrams, SQL/NoSQL tables",Source Artifacts,Table-level and column-level data are used for clustering.,"open source application: 63 classes, 14 interfaces, and 14 database tables.","Data is cleaned and prepared using NLP techniques like tokenization and lemmatization, and enriched with semantic relationships using lexical databases (WordNet/WordWeb).",K-means clustering,Classical ML,Unsupervised,"Table names, foreign keys, and semantic relationships","The quality of clustering (within-cluster squared distance), number of clusters, and developer validation ",Clustering,-,by human developers ,human developers ,"No specific benchmarks, but an illustrative example by one system",one system,Illustrative use case 
with a small monolithic system (Java-based Spring Boot e-commerce app).,a monolithic system,no comparison,no comparison,well-defined microservice candidates based on database schema clustering only,well-defined microservice ,Not available,Not available,Human intervention is needed to validate and refine the automatically generated clusters. Difficulty in handling large database schemas where table relationships are complex. A DDD Approach Towards Automatic Migration To Microservices,Identification,saidi2023_ddd_migration_microservices ,M. Saidi and A. Tissaoui and S. Faiz,2023,Conference,2023 IEEE International Conference on Advanced Systems and Emergent Technologies (IC_ASET),IC_ASET,IEEE Xplore,Identification,identification and decomposition ,group functionally related activities into potential microservices.,"Semi-automatic; clustering is automated, but validation and microservice refinement require developer input.",Microservices,Business processes and domain models,Source Artifacts,Activity-level granularity ,industrial,structural and functional analysis for identifying shared attributes between activities and calculating dependencies between them using a Dependency Structure Matrix (DSM). 
,K-means clustering,Classical ML,unsupervised clustering,"Features include shared attributes, read/write operations, and intra-domain and inter-domain dependencies."," functional cohesion, and low coupling between microservices.",Software Design,"cohesion, and coupling ",Human developers validate the generated microservice candidates for consistency and correctness.,Human developers ,The case study uses the Bicing system with 7 sub-domains,1 system,Applied to a real-world bike rental system for microservice identification.,,no comparison,no comparison,Effective identification of microservices with strong cohesion and low coupling.,Effective identification of microservices with strong cohesion and low coupling.,Not available,Not available,"Accuracy of clustering: Ensuring that the K-means clustering generates microservices with strong internal cohesion and weak external coupling. Manual validation: Even after clustering, developer intervention is required to validate and refine the identified microservices." A hierarchical dbscan method for extracting microservices from monolithic applications. ,Identification,sellami2022_hierarchical_dbscan_microservices,,2022,Conference,"In: Proceedings of the 26th international conference on evaluation and assessment in software engineering, pp 201--210",EASE,ACM,Identification,extraction of microservices from a monolithic application by clustering classes using hierarchical DBSCAN based on structural and semantic similarities. 
It helps identify potential microservices and outliers.,"The approach uses DBSCAN (a density-based clustering algorithm) to group classes into microservices, relying on structural and semantic similarity metrics between classes.","The process is fully automated from class similarity calculation to the hierarchical clustering of microservices, requiring minimal human intervention for customization.","set of candidate microservices derived from the monolith, represented in a hierarchical structure, along with any outlier classes that may need further consideration.","source code files from the monolithic application, which are analyzed to build a structural dependency graph and semantic analysis through class names, method names, and comments.",Source Artifacts,"The system operates at the class level, analyzing dependencies between classes and their interaction through method calls and shared domain concepts.","The data used in the evaluation comes from open-source Java applications, such as Spring PetClinic, Microservices Event Sourcing, and Kanban Board Demo.","extracting structural and semantic features from the source code. 
It builds a call graph and performs text preprocessing (like CamelCase splitting, stop-word removal, and stemming) for natural language processing (NLP) on class names and comments.",Hierarchical DBSCAN for clustering classes into microservices based on their similarity scores.,Classical ML,unsupervised clustering without the need for labeled training data.,structural similarity (based on method calls between classes) and semantic similarity (based on NLP analysis of class and method names).,"precision (how well the extracted microservices match the manually identified ones), structural modularity, interface number, and inter-call percentage to assess the quality of the decomposition.","Classification and Prediction , Software Design","precision, SM, IFN","The method is qualitatively compared with manually designed microservices, showing promising results with precision scores up to 0.9 in some cases.",compared with manually designed microservices,"compared against 5 baselines: Bunch, CoGCN, FoSCI, MEM, and Mono2Micro, outperforming them in several metrics like structural modularity and inter-call percentage.",5 literature approaches comparison,3 Java web app,3 Java web app,"compared to existing microservice extraction techniques such as Bunch and Mono2Micro, achieving better precision and quality in several cases.",Bunch and Mono2Micro,"higher precision, structural modularity, and reduced coupling between microservices,","higher precision, structural modularity, and reduced coupling between microservices,",poc,Not available,"The results depend on the tuning of hyperparameters (e.g., the neighborhood distance and minimum class size), which could affect the final outcome." 
Autoencoder-based Anomaly Detection in Microservices using Distributed Tracing,Monitoring,shahini2024_autoencoder_anomaly_detection,"Shahini, S and Momeni, H",2024,Conference, CSI International Symposium on Artificial Intelligence and Signal Processing (AISP),AISP,IEEE Xplore,Monitoring,detection of anomalies in microservice systems by analyzing distributed traces. It uses span and trace embeddings combined with a Convolutional Autoencoder (CAE) to identify unusual patterns and deviations from normal behavior in microservices,"Convolutional Autoencoder (CAE) to learn normal patterns in microservice traces. The CAE is trained to minimize reconstruction errors during the encoding and decoding of normal traces, and it flags any deviations (high reconstruction errors) as anomalies","The model automates trace parsing, span embedding, trace embedding, and anomaly detection. The unsupervised learning process requires no labeled data and can detect anomalies without manual intervention once trained","anomaly detection results based on the Mean Squared Error (MSE) of trace reconstructions. Traces with high reconstruction errors are classified as anomalies, providing insight into faulty service invocations","distributed traces from microservices. Each trace is composed of multiple spans, which include metadata like service name, operation name, response time, and HTTP status code",Runtime Artifacts,"The method operates at both the span and trace levels. Each trace is composed of multiple spans, with individual embeddings generated for each span. The final trace embedding is the aggregation of the individual span embeddings","The model is evaluated on the TrainTicket dataset, a widely used benchmark for microservices research. 
The dataset contains 170,996 traces, including 16,915 anomalous traces generated using fault injection techniques","The preprocessing includes parsing spans from traces and generating three types of embeddings for each span: name embedding, time embedding, and response code embedding. These embeddings are aggregated to form the final trace embeddings",The model applies a Convolutional Autoencoder (CAE) for learning representations of normal trace patterns. It uses the reconstruction errors from the CAE to flag anomalies. The CAE architecture includes convolutional layers followed by pooling and upsampling operations,Deep Learning,"The model uses unsupervised learning. It trains on normal traces, learning to reconstruct them accurately. Anomalous traces are detected when the reconstruction error exceeds a threshold set based on the distribution of errors from normal traces","name embeddings (based on SBERT), time embeddings (duration, waiting time, execution time, relative start time, link start time), and response code embeddings (one-hot encoded HTTP status codes). These features capture the key aspects of each span in the trace","precision, recall, F1-score, specificity, and Matthews Correlation Coefficient (MCC). The model achieved a precision of 0.731, recall of 0.799, and F1-score of 0.764, outperforming baseline models like TraceAnomaly and",Classification and Prediction,"precision, recall, F1-score, and MCC","The model significantly improves over existing methods, particularly in recall and F1-score, while reducing false positives. It outperforms TraceAnomaly by 37.75% in recall and 23.82% in F1-score, showcasing its robustness in handling a wide variety of faults",outperforms TraceAnomaly,"The method is compared against TraceAnomaly and MultimodalTrace on the TrainTicket dataset. 
The AnoTraceAE model shows improvements across all metrics except recall compared to MultimodalTrace, with significant gains in precision and F1-score",significant gains in precision and F1-score,"The experiments were conducted on a system with Intel i7-CPU, 8GB RAM, running on Windows 10. The model was implemented using Python 3.9.7 and Keras 2.9.0",_,"benchmarked against TraceAnomaly and MultimodalTrace, showing substantial improvements in precision, recall, F1-score, and MCC. It demonstrated better handling of complex trace data and reduced false positives",outperformed TraceAnomaly and MultimodalTrace,"Success is defined by the model's ability to detect anomalies with high precision and recall, outperforming existing baseline methods while minimizing false positives. The improvements across all key metrics validate the model's effectiveness in real-world microservices environments",high precision and recall,poc,Not available,"The model's performance has only been tested on the TrainTicket dataset, and further evaluation is needed to verify its effectiveness across different microservice systems." Autonomous selection of the fault classification models for diagnosing microservice applications,Monitoring,song2024_autonomous_fault_classification,"Song, Yujia and Xin, Ruyue and Chen, Peng and Zhang, Rui and Chen, Juan and Zhao, Zhiming",2024,Journal,Future Generation Computer Systems,FGCS,Wiley,Monitoring,fault diagnosis of microservice applications.,selection of fault classification models for diagnosing microservice faults. 
,highly automated,a fault diagnosis that identifies the presence and type of faults in microservices,"system logs, performance metrics, and fault traces from microservice systems.",Runtime Artifacts,service level and trace level,"real world open source: real-world applications, including Sock-Shop and Train-Ticket datasets.","Preprocessing steps include log parsing, normalization of performance metrics, and feature extraction from service traces. The logs are structured into templates.",Deep learning,Deep Learning,Unsupervised,"features are extracted from logs, traces, and system metrics (e.g., CPU, memory usage) for fault diagnosis.","macro-precision, macro-recall, and macro-F1 score to evaluate the fault diagnosis models. The root cause localization is evaluated using PR@k and Avg@k metrics.",Classification and Prediction,"macro-precision, macro-recall, and macro-F1, PR@k and Avg@k metrics.","The system’s robustness is assessed by comparing it to baseline models, with performance evaluated across various fault types (e.g., CPU hog, memory leak).","performance evaluated across various fault types (e.g., CPU hog, memory leak).","The paper compares the performance of the approach against several baselines, including PC-based and GES-based ",PC-based and GES-based ,The experiments are conducted using large datasets like Sock-Shop and Train-Ticket. The experiments use systems with an Intel Xeon Gold CPU and 64 GB RAM.,datasets: Sock-Shop and Train-Ticket. Intel Xeon Gold CPU and 64 GB RAM.,PC-based and GES-based ,PC-based and GES-based,"Success is measured by the system’s ability to outperform baselines in terms of macro-F1 score, precision, and recall.","outperform baselines in terms of macro-F1 score, precision, and recall.",Not available,not available,"handling unbalanced observational data, where certain fault types are underrepresented. 
" Monolith to Microservices: VAE-Based GNN Approach with Duplication Consideration - 2022 IEEE International Conference on Service-Oriented System Engineering (SOSE),Identification,sooksatra2022_vae_gnn_approach ,"Sooksatra, Korn and Maharjan, Rokin and Cerny, Tomas",2022,Conference, 2022 IEEE International Conference on Service-Oriented System Engineering (SOSE),SOSE,IEEE Xplore,Identification,"partitioning of monolithic applications into microservices, considering class dependencies and communication, while allowing for class duplication to reduce overhead in inter-microservice communication.","A variational autoencoder (VAE) is used to create embeddings from a feature matrix extracted from a dependency graph. The fuzzy c-means algorithm is applied for clustering classes into microservices, considering duplications.",The entire process from extracting dependencies and entry points to clustering classes into microservices is automated using machine learning models.,"a set of microservices that are derived from a monolithic application, optimized for reduced communication overhead and maintainability by allowing class duplication where necessary.",Input data includes dependency graphs and entry point matrices that represent class dependencies and co-occurrence paths from the monolithic application.,Source Artifacts,"The system operates at the class level, extracting relationships between individual classes within the monolith.","Data used in the experiments are from three monolith applications: Bearboard, Autocare, and Pharmacy, developed using the Spring framework.","The process includes building a dependency graph, generating entry point existence and co-existence matrices, and normalizing the data to create a feature matrix for the VAE.",The approach utilizes a Variational Autoencoder (VAE) for generating class embeddings and fuzzy c-means clustering for microservice identification.,Deep Learning,"The system applies unsupervised learning, as it clusters classes 
into microservices without predefined labels.","Features include class dependencies, entry point existence, and entry point co-existence, combined to form a feature matrix used by the VAE.","Structural Modularity (SM), Non-Extreme Distribution (NED), and Interface Number (IFN) to assess the quality of microservices partitioning.",Software Design,"SM, NED, IFN","The method is qualitatively compared with other models (e.g., AE-K and AE-C), showing that the VAE-based approach achieves better structural organization and reduces communication overhead in microservices.",better structural organization and reduces communication overhead in microservices.,"The approach is benchmarked against Autoencoder with k-means (AE-K) and Autoencoder with fuzzy c-means (AE-C), showing improvements in all evaluation metrics, particularly in reducing the number of interfaces and improving modularity.",k-means (AE-K) and Autoencoder with fuzzy c-means (AE-C),"Experiments were conducted using three monolith applications developed in Spring Framework. 
Various neural network architectures were tested for embedding generation, and clustering results were evaluated with and without duplication.",3 monolithic app,"The VAE-based approach is compared to state-of-the-art methods such as AE-K and AE-C, demonstrating superior performance in structural modularity and microservice quality.","AE-K, AE-C","measured by higher structural modularity, balanced class distribution, and reduced interface numbers, all of which are improved by the VAE-based approach compared to baselines.","higher structural modularity, balanced class distribution, and reduced interface numbers,",poc,Not available,"The method does not incorporate dynamic analysis (e.g., runtime data like document size and response time), which could improve the partitioning for performance reasons, The approach has been tested on a limited number of applications, and future work aims to include dynamic data and real-world use cases for further validation." Tracegra: A Trace-Based Anomaly Detection for Microservice Using Graph Deep Learning,Monitoring,chen2022_tracegra_anomaly_detection ,"Chen, Jian and Liu, Fagui and Zhong, Guoxiang and Jiang, Jun and Xu, Dishi and Tan, Zhuanglun and Shi, Shangsong",2022,Journal,Computer Communications,Computer Communications,Web of Science,Monitoring,detecting anomalies in microservice trace data,detect response time anomalies and invocation path anomalies ,automated: need to specify threshold,anomaly scores for each trace,trace logs and performance metrics from microservices. 
,Runtime Artifacts,the microservice and invocation path level,data is collected from both public datasets and an internal ARM server cluster deployment of the TrainTicket application.,"Data Collection, trace logs are parsed to extract the microservice invocation paths, then uses Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to filter out noisy data points.",Variational Graph Autoencoder (VGAE) for learning spatial representations of traces and LSTM Autoencoders and DBSCAN,Graph Based ML,unsupervised learning,microservice invocation paths and response times derived from the trace logs,"precision, recall, F1-score, and area under the ROC curve (AUC) .",Classification and Prediction,"precision, recall, F1-score, and area under the ROC curve (AUC) .","performance is qualitatively assessed by injecting anomalies into the system (e.g., CPU and memory anomalies) and evaluating how well the system detects them.",evaluating how well the system detects them.,-,-,"Experiments are conducted on an ARM server cluster with Kubernetes and containers, and the performance metrics are collected using Prometheus. The model is implemented in PyTorch, and the experiments are run on an Intel Core i5-9500 CPU.","ARM server cluster with Kubernetes, and the performance metrics are collected using Prometheus. Implemented in PyTorch, and the experiments are run on an Intel Core i5-9500 CPU.",-,-,Success is measured by the system's ability to detect anomalies with high precision and recall while maintaining low false positives.,high precision and recall while maintaining low false positives.,Not available,Not available,"Data Sparsity: One challenge mentioned is the sparse nature of the trace data, as many traces only access a few microservices, resulting in sparse matrices when converting the data into TPG form ." 
A multi-model based microservices identification approach,Identification,daoud2021multi,"Daoud, M and El Mezouari, A and Faci, N and Benslimane, D and Maamar, Z and El Fazziki, A",2021,Journal,JOURNAL OF SYSTEMS ARCHITECTURE,JSA,Compendex,Identification,identification,Identification of microservices,automated,Microservices,Business process models (BPMN).,Model Artifacts,Activity-level,The case study is based on the Bicing bike rental system,"constructing dependency matrices for control, data, and semantic dependencies",Collaborative Clustering,Classical ML,Unsupervised learning,"Features include execution order (control), data flow (data dependency), and semantic similarity (functional activity names)","Dunn Index, Afferent and Efferent Coupling, Instability, and Relational Cohesion.",Clustering,"Dunn Index, Afferent and Efferent Coupling, Instability, and Relational Cohesion.","Dunn Index, Afferent and Efferent Coupling, Instability, and Relational Cohesion.","Dunn Index, Coupling, and Cohesion.",Case studies on the Bicing system ,the Bicing system.,Conducted using the business processes of the Bicing bike rental system.,,comparing to approach based on UML diagrams and centralized clustering,UML diagrams and centralized clustering,"Higher cohesion, lower coupling, and better clustering performance using collaborative clustering.","Higher cohesion, lower coupling and better clustering",Not available,Not available,Scalability: Handling larger business process models with many activities. Accuracy of Clustering: Ensuring that clustering results in highly cohesive and loosely coupled microservices Sage: Practical and scalable ML-driven performance debugging in microservices,Monitoring,gan2021_sage_performance_debugging ,"Gan, Y. and Liang, M. and Dev, S. and Lo, D. 
and Delimitrou, C.",2021,Conference,International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS,ASPLOS,ACM,Monitoring,root cause analysis,analyze performance ,Highly automated,"root cause microservice or system resource that triggered the QoS violation, along with proposed corrective actions for performance improvement.","traces of RPC-level interactions, latency metrics, and resource usage data (e.g., CPU, memory, network utilization).",Runtime Artifacts,trace microservice level,Synthetic (Apache Thrift) and open source (DeathStarBench),"Preprocessing involves gathering distributed traces, constructing dependency graphs of microservice interactions, and collecting resource usage data (e.g., CPU, memory, network). Metrics are normalized ",Causal Bayesian Networks (CBNs) and Graphical Variational Autoencoders (GVAEs),Graph Based ML,Unsupervised learning,"latency metrics, resource usage (CPU, memory, network), and RPC dependencies between microservices.","accuracy, QoS improvement","Classification and Prediction , System Behavior","accuracy, QoS improvement",The system is evaluated in both local clusters and Google Compute Engine (GCE) clusters,The system is evaluated in both local clusters and Google Compute Engine (GCE) clusters,"The system is compared to other performance debugging tools, such as Seer",,,,"Seer and other trace-based performance debugging tools,",,"improved QoS predictability, high root cause identification accuracy, and reduced overhead compared to other performance debugging tools.",,Not available,Not available,Scalability: Scaling the system to handle large clusters with numerous microservices requires careful optimization of the ML models. Latency Sensitivity: Ensuring that the ML models can accurately capture and predict latency-sensitive QoS violations across distributed services is another key challenge. 
Monolith to Microservices: Representing Application Software through Heterogeneous Graph Neural Network,Identification,mathai2021_hgnn_microservices_representation ,"Mathai, Alex and Bandyopadhyay, Sambaran and Desai, Utkarsh and Tamilselvam, Srikanth",2021,Conference,International Joint Conference on Artificial Intelligence ,IJCAI,Scopus,Identification,decomposition of monolithic systems into microservices.,Cluster components that should be refactored into microservices.,automated: number of clusters or the threshold for clustering components,Clusters of components that are recommended as candidate microservices,"source code (program files, functions, and resources)",Source Artifacts,class level,open-source monolithic applications,static analysis of programs (classes/functions) and their resource usages (such as database tables and files). ,graph neural network (GNN),Graph Based ML,unsupervised learning,types of nodes and the edges representing relationships between components.,"Modularity (Mod), Non-Extreme Distribution (NED), Coverage, S-Mod",Clustering,"Modularity (Mod), Non-Extreme Distribution (NED), Coverage, S-Mod","qualitative assessment through experimental case studies on different types of monoliths,","experimental case studies on different types of monoliths,",-,-,-,-,"COGCN++ : [Desai et al., 2021] ,HetGCNConv, CHGNN-EL, CHGNN","COGCN++,HetGCNConv, CHGNN-EL, CHGNN",ability to accurately cluster software components into potential microservices.,ability to accurately cluster ,Not available,Not available,"Dealing with heterogeneous data types (e.g., classes, functions, resources) and their relationships" Microservice extraction using graph deep clustering based on dual view fusion,Identification,qian2023_graph_clustering_extraction,"Qian, LF and Li, J and He, XD and Gu, RB and Shao, JW and Lu, YQ",2023,Journal,INFORMATION AND SOFTWARE TECHNOLOGY,IST,ScienceDirect,Identification,identification of microservices,"embedding node features, and clustering is 
performed to identify microservices.",Embedding generation and clustering,Microservices clusters,Runtime trace data from monolithic applications and class invocation graphs.,Runtime Artifacts,class level,"Four open-source Java monolithic applications: acmeair, daytrader, plants, and jpetstore.",structural and business analyses. Involves random walk algorithms to generate the business function view and sampling from structural views.,Graph Attention Neural Network (GAT).,Graph Based ML,"Unsupervised learning, utilizing attention mechanisms for node embedding and clustering.",Invocation relationships and business function dependencies.,"BCP (Business Context Purity), ICP (Interpartition Call Percentage), SM (Structural Modularity), IFN (Interface Number), NED (Non-Extreme Distribution).","Software Design, Clustering",-,same as evaluation metrics,-,four open source projects,four open source projects,Evaluated using five metrics,Evaluated using five metrics,"FoSCI, Mono2Micro, and GCN.","FoSCI, Mono2Micro, and GCN.",showing superior performance in modularity and independence.,showing superior performance in modularity and independence.,Not available,Not available,Scalability: Ensuring the method works with larger systems. Balancing clustering approaches for optimal microservice quality. 
Multilayered Fault Detection and Localization With Transformer for Microservice Systems,Monitoring,wang2024_fault_detection_transformer ,"Wang, JY and Li, YW and Qi, Q and Lu, Y and Wu, B",2024,Journal,IEEE TRANSACTIONS ON RELIABILITY,ToR,IEEE Xplore,Monitoring,Anomaly detection at the container level and localization of root cause metrics using a transformer-based model.,anomaly detection and localisation,automated: hyperparameters specification,detected anomalies,microservice containers metrics and invocation traces from service calls,Runtime Artifacts,Data is collected at both the container level and service layer.,"Three datasets are used: Sock-Shop dataset, AIOps2020 preliminary dataset, and SMD (Server Machine Dataset).","Involves normalization, creation of dependency graphs, and extraction of time-series features.",Transformer model for anomaly detection,Deep Learning,Unsupervised learning,"includes CPU usage, memory consumption, and latency metrics.","Precision, recall, F1 score, AUC (Area Under the ROC Curve), and AP (Average Precision) for anomaly detection. Precision@k and MAP for root cause localization.",Classification and Prediction,"Precision, recall, F1 score, AUC for anomaly detection. Precision@k and MAP for root cause localization.","AUC (Area Under the ROC Curve), and AP (Average Precision) for anomaly detection. MAP for root cause localization.","AUC (Area Under the ROC Curve), and AP (Average Precision) for anomaly detection. MAP for root cause localization.","Empirical evaluation on the Sock-Shop, AIOps2020, and SMD datasets.","Sock-Shop, AIOps2020, and SMD datasets.",Model implemented with PyTorch and evaluated using NVIDIA RTX3090 GPUs. 
Data is processed with batch sizes of 100.,,"baseline models like MicroRCA, MonitorRank, and Microscope.","MicroRCA, MonitorRank, and Microscope.",Improvements in F1 score by 0.25 and mean average precision by up to 0.54 for localization.,Improvements in F1 score ,Not available,Not available,Data volume: The large-scale data from multivariate time-series and real-time monitoring makes the analysis computationally expensive. Model efficiency: Ensuring the transformer-based model is efficient enough for real-time anomaly detection. Interpretable Failure Localization for Microservice Systems Based on Graph Autoencoder,Monitoring,sun2024_failure_localization_autoencoder,"Sun, Y and Lin, Z and Shi, B and Zhang, S and Ma, S and Jin, P and ...",2024,Journal,ACM Transactions on Software Engineering and Methodology ,TOSEM,ACM,Monitoring,"localization of root causes of failures in microservice systems. It analyzes multimodal data, including traces, logs, and metrics, to build a System Behavior Graph (SBG), which is then used to identify the faulty instances responsible for system failures.",Graph Autoencoder (GAE) for learning system dependencies and failure propagation patterns. The GAE enables the model to capture the relationships between instances in the microservice system and identify root cause instances by analyzing reconstruction errors and failure propagation patterns.,"The method automates the entire process, from collecting multimodal data (logs, traces, metrics) to constructing the system behavior graph and detecting failures through unsupervised learning with the GAE. Additionally, a feedback mechanism is integrated for continuous learning and fine-tuning.","interpretable localization of root cause instances, ranked by their likelihood of causing the failure. 
A root cause score is calculated for each instance, which is used to identify the most likely culprits.","multimodal monitoring data from microservice systems, including traces (invocation chains), logs (runtime messages), and metrics (performance indicators).",Runtime Artifacts,"system instance level. Each system instance (e.g., microservice or host instance) is analyzed for failures, and its relationships with other instances are modeled in the system behavior graph.","two open-source datasets from real-world microservice systems. These datasets include traces, logs, and metrics from simulated and real-world failures.","serializing traces, logs, and metrics into a multivariate time series and constructing the system behavior graph based on invocation and deployment dependencies between instances.","Graph Autoencoder (GAE), which learns representations of the system's normal behavior patterns and detects anomalies by measuring reconstruction errors and failure propagation patterns.",Graph Based ML,"The approach employs self-supervised learning. The GAE is trained using reconstruction tasks, without requiring manually labeled data for normal system operation. A feedback mechanism is used to incorporate operator corrections for continuous learning.",reconstruction errors (indicating the deviation from normal patterns) and failure propagation patterns (capturing the spread of anomalies through dependencies).,"Top-k accuracy (A@k) and Top-5 average accuracy (Avg@5), which measure the model's ability to rank the correct root cause within the top k instances.",Classification and Prediction,(A@k and Avg@5),"The method was assessed through comparison with nine baseline methods, showing significant improvements in accuracy, even with limited labeled data. 
The model's interpretability (via the root cause score) was a notable advantage.",The model's interpretability was a notable advantage.,"The method was benchmarked against other failure localization techniques, including MicroHECL, MicroRank, AutoMAP, DéjàVu, and DiagFusion, consistently outperforming them across several datasets.",5 techniques,"The experiments were conducted on two datasets (D1 and D2), collected from a simulated e-commerce system and a commercial bank's management system, respectively.",2 big datasets,"compared to a range of baseline approaches, including both non-deep learning and deep learning methods. DeepHunt showed superior performance, particularly in scenarios with limited labeled data.",compared with 9 approaches ,"Success is defined by the method's ability to localize root cause instances accurately with minimal labeled data, as well as the system's ability to learn continuously through operator feedback and data augmentation.",ability to localize root cause instances accurately with minimal labeled data,open source,Open source,"The model requires continuous fine-tuning to adapt to evolving systems, which may challenge its scalability." PUTraceAD: Trace Anomaly Detection with Partial Labels based on GNN and PU Learning ,Monitoring,zhang2022_putracead_trace_anomaly_detection ,"Zhang, Ke and Zhang, Chenxi and Peng, Xin and Sha, Chaofeng",2022,Conference,IEEE 33rd International Symposium on Software Reliability Engineering (ISSRE),ISSRE,IEEE Xplore,Monitoring,"trace anomaly detection in microservice systems using a combination of Graph Neural Networks (GNN) and Positive-Unlabeled (PU) learning. The system detects trace anomalies based on causal relationships in service invocations, aiming to identify faulty traces with minimal labeled data","Graph Neural Networks (GNNs) are used to represent traces as span causal graphs, preserving the hierarchical structure of service invocations. 
PU learning optimizes the model based on a small set of labeled positive samples (anomalous traces) and a large set of unlabeled traces. This combination helps the model learn efficiently even with minimal labeled anomaly data​","The system automatically parses trace logs, constructs span causal graphs, and trains a GNN-based model using a PU learning framework, requiring minimal manual labeling. It significantly reduces the effort needed for fault injection or manual annotation of anomalies","anomaly prediction for each trace, which classifies whether the trace is anomalous or normal. The system returns a binary decision after processing the span causal graph and applying the learned GNN model​","distributed traces from microservices, where each trace is composed of multiple spans representing service invocations. Spans include attributes such as operation name, response code, start and end times, and durations​",Runtime Artifacts,"span level (individual service invocations) and trace level (causal relationships between spans). Each span is represented as a node in the span causal graph, and the entire trace is modeled as a graph​","open-source microservice benchmark system called TrainTicket, widely used for research on microservice architectures. This dataset includes traces from a variety of faults injected into the system​","Preprocessing involves span embedding, where service and operation names, response codes, and time-related features are embedded using methods like BERT and one-hot encoding. The spans are then used to build a span causal graph, where each node contains the span’s embedded features​","The approach uses Graph Neural Networks (GNN), specifically Graph Attention Networks (GAT), to model the causal relationships between spans. 
The PU learning method is employed to train the model with limited labeled data while effectively leveraging the knowledge of unlabeled traces",Graph Based ML,"PU learning (Positive-Unlabeled learning), which allows the model to learn from a small number of labeled anomalous traces and a large number of unlabeled traces. This approach reduces the need for extensive labeling","semantic features (service and operation names), time-related features (start time, duration, waiting time), and response codes (HTTP status codes), which are used to construct the span embeddings","precision, recall, and F1-score to assess the anomaly detection performance. The model achieved a precision of 0.905, recall of 0.987, and F1-score of 0.944, outperforming several unsupervised baselines",Classification and Prediction,"precision, recall, and F1-score","significant improvements over unsupervised methods like TraceAnomaly and MultimodalTrace, particularly in handling both labeled and unlabeled trace types. The method’s ability to detect anomalies with limited labeled data is one of its key strengths","outperformed unsupervised approaches, slightly underperformed a supervised learning-based approach","compared against TraceAnomaly and MultimodalTrace, achieving better precision, recall, and F1-score across various configurations. It also slightly underperformed a fully supervised variant of the model (SupervisedTraceAD) in some cases","better precision, recall, and F1-score compared to 2 unsupervised approaches (TraceAnomaly and MultimodalTrace)","conducted on a Kubernetes cluster using the TrainTicket microservice system, with faults injected using ChaosMesh. 
The model was implemented using Python 3.9.7 and PyTorch 1.9.1​",Kubernetes,TraceAnomaly and MultimodalTrace (unsupervised) and SupervisedTraceAD (supervised),"2 unsupervised approaches outperformed, underperformed supervised one","measured by the model’s ability to accurately detect anomalies with high precision and recall, while requiring minimal labeled data. The system’s performance with just 5% labeled anomalous traces was a key metric for success​",high precision and high recall,open source,open source,"Since the model relies on a small subset of labeled data, its ability to generalize to unseen anomaly types depends on the diversity of the labeled traces. The system must balance between learning from the limited labeled data and the vast set of unlabeled traces​, The model is trained with a small portion of labeled anomalous data (only 5% of all anomalous traces), which may lead to challenges in handling complex or rare anomalies that are underrepresented in the training set" MAAD: A Distributed Anomaly Detection Architecture for Microservices Systems,Monitoring,tan2024_maad_anomaly_detection,"Tan, R and Li, Z",2024,Conference,2024 IEEE International Parallel and Distributed …,IPDPS,IEEE Xplore,Monitoring,"distributed anomaly detection in microservices systems. Each agent deployed alongside a microservice performs real-time anomaly detection locally, reducing dependency on centralized processing and enabling faster detection.","Lightweight machine learning models are integrated into each agent for local anomaly detection. 
The model extracts and propagates features across agents to capture contextual and graph structure information, enhancing detection accuracy.","MAAD performs anomaly detection at each service level without manual intervention, employing agents that locally detect and communicate abnormal behaviors across the service chain.","anomaly detection decision for each service request, which identifies if and where anomalies occur within the service chain, with high precision and recall results.",The inputs consist of logs and traces captured from service executions within the microservices architecture.,Runtime Artifacts,"MAAD analyzes data at both span and trace levels, with each span representing an individual service invocation and each trace representing a complete request path.","The system was tested on TrainTicket and MicroSS datasets, which are synthetic benchmarks commonly used for microservices systems.","MAAD includes log and trace parsing, word embedding, sentence vectorization, and feature extraction, transforming logs and traces into vectorized formats for anomaly detection.","MAAD employs CNN and LSTM models within each agent, alongside a Multi-decision Merger model (using HistGradientBoosting) for aggregating span-level decisions.",Deep Learning,"Supervised learning is used, with agents trained on labeled traces and spans for fault classification.","Selected features include log vectors, span vectors, and propagated parent features, capturing critical contextual information relevant to anomaly detection.","The evaluation metrics include precision, recall, and F1-score, with MAAD achieving up to 99.6% recall and 95.8% precision on TrainTicket.",Classification and Prediction,"precision, recall, and F1-score",Qualitative comparisons with other models (like DeepLog and LogAnomaly) show that MAAD's distributed design improves detection accuracy and reduces latency.,improves detection accuracy and reduces latency.,"MAAD is compared to centralized and IoT-based anomaly 
detection methods, achieving significantly better performance in accuracy and lower data transfer needs.","compared to centralized and IoT-based anomaly detection methods,","Experiments were conducted on a Kubernetes cluster with Intel XEON CPUs, simulating a distributed microservices environment.",Kubernetes,"MAAD was compared against DeepLog, LogAnomaly, and other centralized models; it outperforms these baselines, especially in recall and detection latency.","DeepLog, LogAnomaly, and other centralized models","Success is defined by higher detection accuracy, reduced detection latency, and minimal data transfer, achieving improved F1-scores and lower resource consumption.",improved F1-scores and lower resource consumption.,open source,open source,"Ensuring graph structure retention in a distributed setup is challenging but essential for detection accuracy, Designing lightweight models that perform well on CPU-only environments without GPUs." Implementation of Domain-oriented Microservices Decomposition based on Node-attributed Network,Identification,cao2022_domain_oriented_decomposition,Lingli Cao and Cheng Zhang,2022,Conference,Proceedings of the 2022 11th International Conference on Software and Computer Applications,ICSA,IEEE Xplore,Identification,microservice decomposition of monolithic systems by analyzing system behavior and characteristics. The method combines both static and dynamic analysis to generate microservice candidates by constructing a node-attributed network of system components​,"community detection algorithms and a hierarchical clustering algorithm to group system components into microservice candidates. These algorithms optimize the decomposition by clustering based on functional requirements, inter-service communication, and performance​","The system automates the collection of invocation relationships and performance metrics, construction of node-attributed networks, and the clustering of microservice candidates. 
The process eliminates the manual steps of microservice identification and improves decomposition quality by using objective measures​","microservice candidates, each representing a cohesive group of methods and services based on dynamic and static behavior characteristics. The method produces a decomposition that adheres to microservice principles of high cohesion and low coupling​",dynamic monitoring logs and static invocation relationships between methods in the monolithic system. The dynamic logs capture system behaviors like invocation times and response times during runtime​,"Source Artifacts, Runtime Artifacts","method level, analyzing individual method invocations and performance metrics to ensure fine-grained microservice decomposition​","dynamic analysis tools (e.g., Kieker for monitoring runtime behavior) and static analysis tools (e.g., Java-callgraph for capturing invocation relationships in the source code). The system is validated using the open-source JPetStore",filtering irrelevant invocation data from the logs and normalizing the static and dynamic data formats to ensure consistency. This allows the system to generate a unified representation of method relationships for the node-attributed network​,community detection algorithm and a similar hierarchical clustering algorithm. These algorithms work together to identify strongly connected components that could be grouped as microservice candidates​,Classical ML,"This approach is unsupervised. The system uses community detection to automatically cluster components without labeled data, relying on invocation patterns and method performance metrics","method invocation frequency, response times, and invocation relationships. These features help the system group methods that interact frequently and perform similarly into cohesive microservices​","cohesion (M_Ch), coupling (M_Cp), and density, which measure the internal connectivity and independence of microservice candidates. 
These metrics ensure that the decomposition adheres to the principles of microservice design","Clustering, Software Design","M_Ch, M_Cp, density","The method improves upon manual decomposition by automating the process and producing more objective, efficient results. It ensures better performance and communication management when applied to real-world projects like JPetStore",better performance and communication management,"compared against other decomposition techniques, such as FoSCI, DFD, and manual decomposition. The proposed method achieved better cohesion and coupling results, demonstrating improved microservice candidate quality","other decomposition techniques, such as FoSCI, DFD, and manual decomposition","tested on the JPetStore project, a well-known benchmark for microservice decomposition methods. The method was evaluated on a typical monolithic system of moderate size, providing a fair comparison against other techniques",1 OS project (JPetStore),"compared with FoSCI, DFD, and code2vec-based decomposition. It outperformed these methods in terms of cohesion, coupling, and flexibility, especially due to its fine-grained, method-level decomposition","outperformed other methods in terms of cohesion, coupling, and flexibility","ability to decompose monolithic systems into microservice candidates with high cohesion and low coupling. 
The method significantly improves performance, communication management, and the accuracy of microservice decomposition",High cohesion and low coupling,Open source,Open source,"Balancing multiple attributes (e.g., performance, invocation frequency) within the node-attributed network while ensuring meaningful clustering presents a challenge in terms of feature weighting and model tuning" Code Vectorization and Sequence of Accesses Strategies for Monolith Microservices Identification,Identification,faria2022_code_vectorization_microservices,"Faria, Vasco and Silva, Antonio Rito",2022,Conference,International Conference on Web Engineering. Cham: Springer,ICWE,Compendex,Identification,identification of microservices from monolithic applications by vectorizing code using the Code2Vec neural network model and applying clustering algorithms based on functionality and sequence of accesses to domain entities,"Code2Vec neural network to generate embeddings for each method in the monolith, transforming code into vectors. These vectors are then used to analyze the system’s functionality and sequence of accesses to domain entities, enabling the system to cluster the components into potential microservices","The system fully automates the process of code vectorization, method dependency extraction, and microservice clustering, using machine learning and static analysis to identify microservice candidates without manual intervention",clusters of functionalities or domain entities that are candidates for microservice decomposition. These clusters represent groups of methods or entities that form cohesive and loosely coupled microservices,"monolith codebases written in Java, which are parsed and analyzed using abstract syntax trees (ASTs). Code2Vec converts the method bodies into vectors, while the system tracks function invocations and entity access sequences",Source Artifacts,"method level. 
It processes individual method invocations and their relationships, transforming them into vectors that capture the core functionality of the codebase​",85 monolithic Java-based codebases obtained from GitHub. These codebases were filtered to include systems with at least five domain entities and controller classes​,"JavaParser to analyze the Java codebase and extract methods, class dependencies, and package information. The Code2Vec model is applied to generate method embeddings, which are then used to analyze the code’s functional structure​","Code2Vec model, a neural network that generates fixed-length embeddings for each method in the monolithic application. These embeddings are used to capture the functional and lexical properties of the code, which are then clustered to identify potential microservices​",Classical ML,"unsupervised learning via clustering techniques. Methods are clustered based on the similarity of their Code2Vec-generated embeddings and access sequences to domain entities, creating natural groupings that represent microservices​","method embeddings (generated by Code2Vec), functionality call graphs, and sequence of accesses to domain entities. These features capture the functional and structural dependencies in the monolithic code","cohesion, coupling, and complexity",Software Design,"cohesion, coupling, and complexity",the use of Code2Vec embeddings and functional vectorization produces more cohesive and less coupled microservices compared to other approaches. The system outperforms traditional static analysis-based methods in terms of simplicity and adaptability to different technologies​,more cohesive and less coupled microservices compared to other approaches,/,/,"The experiments were conducted using 85 Java codebases from GitHub, filtered based on the number of domain entities and the Spring Data JPA library dependencies. 
The system was evaluated using cohesion, coupling, and complexity metrics",85 GitHub Java apps,"compared against methods that rely on static code analysis and class vectorization. The Code2Vec-based approaches showed superior results, especially in terms of the balance between simplicity and performance in identifying microservice candidates",outperformed methods that rely on static code analysis and class vectorization,"ability to identify cohesive and loosely coupled microservices, as indicated by improvements in the cohesion, coupling, and complexity metrics over traditional static analysis methods",cohesive and loosely coupled MS,Open source,Open source,"The quality of the decompositions relies heavily on the accuracy of the Code2Vec vectors. If the model fails to capture the functional dependencies between methods accurately, it may lead to incorrect microservice boundaries" Microservices performance forecast using dynamic Multiple Predictor Systems,Monitoring,santos2024_performance_forecast_microservices ,"Santos, WRM and Sampaio, AR Jr and Rosa, NS and Cavalcanti, GDC",2024,Journal,ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE,Eng. Appl. Artif. 
Intell.,ScienceDirect,Monitoring,performance forecasting of microservices to prevent performance degradation.,"Predicting resource utilization metrics (CPU, memory, response time) to support autoscaling decisions.",Highly automated,autoscaling decisions,"Time series data of performance metrics, including CPU usage, memory, response time, and traffic",Runtime Artifacts,microservice level,synthetic series generated by tools like Locust and real-world series collected from Alibaba's production cluster.,Time series are preprocessed using normalization (min-max scaling) and transformed into sliding windows,"ensemble of six algorithms: ARIMA, Multilayer Perceptron (MLP), Support Vector Regressor (SVR), Random Forest (RF), XGBoost, and Long Short-Term Memory (LSTM).",Classical ML,Supervised learning,"CPU usage, memory, and traffic.",Root Mean Square Error (RMSE) is used to measure prediction accuracy.,Classification and Prediction,Root Mean Square Error (RMSE) for accuracy.,-,-,Alibaba's production cluster.,Alibaba's production cluster.,Data comes from both synthetic series generated by Locust and real-world series collected from Alibaba's production cluster.,,"Comparison between single-model forecasting (Classical Forecasting Approach, CFA) and the proposed Multiple Predictor Systems (MPS).","Classical Forecasting Approach, CFA",Achieving better accuracy in predicting resource usage for autoscaling ,Higher accuracy,open source and available,open source and available,Computational complexity: Managing the overhead of training and selecting multiple models dynamically. Handling real-time demands: Ensuring the system can respond quickly enough to support real-time autoscaling decisions. Model selection: Choosing the right model for each test pattern in dynamic environments without manual intervention. DeepRest: Deep Resource Estimation for Interactive Microservices,Deployment,chow2022_deeprest_resource_estimation,"Chow, K.-H. and Deshpande, U. and Seshadri, S. 
and Liu, L.",2022,Conference,EuroSys 2022 - Proceedings of the 17th European Conference on Computer Systems,EuroSys,ACM,Deployment,resource estimation and sanity checks,predict the resource consumption based on API traffic patterns and sanity checks,Fully automated,"resource consumption predictions, such as CPU usage, disk IO operations, memory usage, and throughput, based on API traffic patterns.","API traffic logs, traces, and resource metrics (CPU usage, memory usage, IO operations) from the system.",Runtime Artifacts,Logs of the component level,real-world industrial applications,"data collected using distributed tracing tools Jaeger, then constructing a directed invocation graph, and transforming traces into structured features","deep neural network (DNN) with a gated recurrent unit (GRU) layer for time-series prediction, combined with an attention mechanism ",Deep Learning,Unsupervised learning,"Features include API invocation paths, resource usage metrics (CPU, memory, IO operations), and the interaction between components based on API traffic.","resource estimation accuracy: Mean Absolute Percentage Error (MAPE), Throughput and Inference Time",System Behavior,"Mean Absolute Percentage Error (MAPE), Throughput and Inference Time","real-world applications (social network and hotel reservation systems) and compared to three baselines: resource-aware deep learning, simple scaling, and component-aware scaling.",,real-world applications and compared to three baselines,,"The experiments were conducted on a cluster of Docker containers orchestrated by Kubernetes, with Prometheus for monitoring and Jaeger for tracing. The system was trained for 30 epochs using stochastic gradient descent (SGD).",Virtual setup: Docker containers and Kubernetes. 
30 epochs using stochastic gradient descent (SGD).,"DeepRest was compared to resource-aware DL, simple scaling, and component-aware scaling approaches","resource-aware DL, simple scaling, and component-aware scaling approaches",DeepRest outperforms baseline methods with 7.86-11.19% error in CPU estimation compared to 16.06-82.56% error for baseline methods.,DeepRest outperforms baseline methods in error in CPU estimation ,proprietary system using PyTorch,proprietary system using PyTorch,"learning from unstructured traces and transforming them into structured features for the deep learning model, Resource utilization across components is often correlated, and the model must capture these correlations to improve estimation accuracy." MicroMatic: Fully Automated Microservices Identification Approach From Monolithic Systems,Identification,trabelsi2024_micromatic_automation,"Trabelsi, I and Popa, B and Péreyrol, J and Beaulieu, PO and ...",2024,Workshop,International Workshop on Software Engineering Research & Practices for the Internet of Things ,SERP4IoT,IEEE Xplore,Identification,Identification of microservices from monolith system,"semantic analysis, classification phase to label classes as Application, Entity, or Utility services. Clustering ",Highly automated,microservices by grouping classified service types,Source code,Source Artifacts,Classes,Open source data,Static to KDM and semantic analysis ,"CodeBERT, Ensemble Learning, SVM, Decision Trees, and others for classification of classes. 
Community detection algorithms (e.g., Louvain, Girvan-Newman) for clustering.",Classical ML,"Unsupervised learning for clustering and semantic analysis, supervised learning for classification","Features generated from pre-trained models like CodeBERT, and static and semantic relationships.","Metrics used include precision, recall, and F-measure ",Classification and Prediction,"precision, recall, and F-measure ",A case study on FXML-POS,A case study on FXML-POS,A case study on FXML-POS,FXML-POS,"The experiment used systems like FXML-POS for validation, with classification and clustering algorithms applied to the source code, and the results were compared to a ground truth",ground truth,MicroMiner,MicroMiner,High automation,high automation,web application with a GitHub repository,web application with a GitHub repository,"Imbalanced Data: One challenge in ML is handling imbalanced data, addressed by using SMOTE (Synthetic Minority Over-sampling Technique) for better classification of underrepresented service types. Generalizability: Training models on Java systems poses a challenge in generalizing the tool for systems developed in other programming languages like C or Python. Model Performance: Achieving accurate classification and clustering depends on the quality of the training data and hyperparameter tuning, which can vary between systems" Microservice Deployment in Edge Computing Based on Deep Q Learning,Deployment,lv2022_deployment_edge_computing,"Lv, Wenkai and Wang, Quan and Yang, Pengfei and Ding, Yunqing and Yi, Bijie and Wang, Zhenyi and Lin, Chengmin",2022,Journal,IEEE Transactions on Parallel and Distributed Systems,TPDS,IEEE Xplore,Deployment,Automated deployment strategy for microservices in edge computing. 
Uses RL-based optimization to balance load and reduce latency.,Uses Reward Sharing Deep Q Learning (RSDQL) to optimize microservice deployment on edge nodes.,Semi-automated with continuous monitoring and periodic deployment adjustments.,,"Resource usage data (CPU, memory, network) and microservice invocation metrics from edge nodes.",Runtime Artifacts,"Service-level data for resource usage, monitoring each microservice's needs and interactions.",Collected from Kubernetes edge environment with real and synthetic loads for testing.,Normalization and encoding of resource metrics and invocation data for model input.,Deep Q-learning with reward sharing (RSDQL) for edge deployment.,Reinforcement Learning,"Reinforcement learning, learning through environmental interactions.","Invocation frequencies, CPU/memory usage, node availability.","Average response time, load balance, scalability under requests.",System Behavior,"Response time, load balance, scalability","Compared with Kubernetes default and interaction-aware strategies, based on performance.","Kubernetes comparison, interaction-aware, performance feedback","Benchmarked against standard Kubernetes strategies, improvements in response/load.",improvements in response/load,"Kubernetes cluster with master and five nodes, using BookInfo and DeathStarBench.","Kubernetes cluster, BookInfo, DeathStarBench","Outperforms Kubernetes and interaction-aware strategies in response, load, scalability.",Outperformance,"Reduced response times, balanced load, and elastic scaling under real-time demand.","Reduced response times, balanced load, elastic scaling",,, From Monolithic Architecture Style to Microservice one Based on a Semi-Automatic Approach ,Identification,selmadji2020_transition_microservices,"Selmadji, Anfel and Seriai, Abdelhak-Djamel and Bouziane, Hinde Lilia and Oumarou Mahamane, Rahina and Zaragoza, Pascal and Dony, Christophe",2020,Conference,IEEE International Conference on Software Architecture (ICSA),ICSA,IEEE 
Xplore,Identification,Automated microservice identification guided by recommendations and quality metrics.,"Hierarchical clustering algorithm for grouping classes, incorporating quality and architect's recommendations",Semi-automated with architect recommendations at the start of the process.,microservices,"Source code, persistent data relationships, architect recommendations.","Source Artifacts, Domain Artifacts",Class level,Java applications of various sizes collected from GitHub.,"Quality function evaluation, clustering with architect guidance, data autonomy verification.",Hierarchical clustering algorithm.,Classical ML,Semi-supervised with architect intervention.,"Structural dependencies, behavioral autonomy, and data autonomy.","Microservice quality based on structural cohesion, behavioral independence, and data autonomy.",System Behavior,"Cohesion, behavioral independence, data autonomy.",Comparison to manually identified microservices with high recall and precision.,manually identified microservices,Benchmarked against manually identified microservices across applications.,Comparison on real Java data.,Java projects tested with small to large class sizes.,several Java projects,Compared with heuristic-based approaches; outperforms in terms of alignment with microservice semantics.,Compared to heuristic approaches,High recall with accurately clustered microservices and minimized dependencies.,"High recall, independent microservices with minimal dependencies.",,, Anomaly Detection and Diagnosis for Container-Based Microservices with Performance Monitoring,Monitoring,du2018_anomaly_container_microservices,"Du, Qingfeng and Xie, Tiandi and He, Yu",2018,Conference,"Algorithms and Architectures for Parallel Processing: 18th International Conference, ICA3PP",ICA3PP,Compendex,Monitoring,"The primary task automated here is real-time anomaly detection and diagnosis using machine learning models, specifically targeting container performance and fault detection.",ML is 
used to monitor and analyze real-time performance metrics for anomaly detection and diagnose issues within containerized microservices. Supervised learning techniques classify service states as normal or anomalous.,Real-time data collection and fault detection are fully automated.,,"various performance metrics such as CPU usage, memory utilization, network latency, and packet loss.",Runtime Artifacts,CPU and memory usage rates for each container and service,"Data is sourced directly from real-time performance monitoring tools within containerized environments, often through agents like cAdvisor and Heapster.","fault injection and labeling to simulate anomalous and normal conditions, which helps the model learn the distinctions.","Support Vector Machines (SVM), Random Forest, Naive Bayes, and k-Nearest Neighbors (kNN), for anomaly classification.",Classical ML,"A supervised approach is used, with training data labeled based on normal and fault-injected scenarios to classify behaviors.","CPU metrics (usage, limit, usage-rate), memory metrics (total usage, cache, RSS, page faults), and network metrics (bytes received/transmitted, error rates)","precision, recall, and F1-score",Classification and Prediction,"precision, recall, and F1-score",The system’s performance is evaluated qualitatively through fault injection experiments to simulate realistic operational conditions.,fault injection experiments,"Random Forest and kNN algorithms achieve the highest performance, with precision and recall both scoring above 0.9.","precision/recall above 0.9","Experiments were conducted on a Kubernetes cluster using virtual machines, with tools like cAdvisor and InfluxDB for data collection and Grafana for visualization.",Kubernetes cluster (cAdvisor and InfluxDB for data collection and Grafana for visualization.),"The document compares different classifiers (SVM, Naive Bayes, etc.) 
and finds that Random Forest and kNN outperform others in accuracy and robustness.","compares different classifiers (SVM, Naive Bayes, knn)","Success is defined by the model's ability to detect anomalies with high precision and recall, maintaining service reliability by identifying faults quickly.","ability to detect anomalies with high precision and recall,",,,"Challenges mentioned include selecting the right metrics to monitor, handling high volumes of real-time data, and ensuring model accuracy in varied real-world conditions. Additionally, fault injection in a live environment poses logistical and operational challenges." DeepTraLog: Trace-Log Combined Microservice Anomaly Detection through Graph-based Deep Learning ,Monitoring,zhang2022_deeptralog_combined_anomaly,"Zhang, Chenxi and Peng, Xin and Sha, Chaofeng and Zhang, Ke and Fu, Zhenqing and Wu, Xiya and Lin, Qingwei and Zhang, Dongmei",2022,Conference,IEEE/ACM 44th International Conference on Software Engineering (ICSE),ICSE,IEEE Xplore,Monitoring,Data extraction and analysis are automated to streamline the migration phase.,"Machine learning is used in data preparation, prediction, and clustering for anomaly detection.","Automation is extensive, with minimal human intervention across key tasks.",,"code, logs, and UML diagrams, each offering unique insights for anomaly analysis","Runtime Artifacts, Model Artifacts","entire files to individual lines, allowing precise anomaly tracking.","open-source, industrial, and user-contributed inputs.",/,"SVM, GNN, and DL are selected based on data complexity.",Graph Based ML,Supervised and unsupervised learning are used for predictive and anomaly detection tasks.,"event sequence features (like service invocation patterns and response times), specific log features (such as log event types, message patterns, and parameter values), and graph-based features from the Trace Event Graph (TEG), which capture node representations and edge relationships. 
Additionally, temporal features (including timestamp patterns and event frequency) and contextual features (such as counts of specific event types and trace complexity)","precision, recall, and F1-score are used to quantitatively evaluate the model's performance in detecting anomalies.",Classification and Prediction,"precision, recall, and F1-score",qualitative assessments are conducted through user feedback and manual review of detected anomalies.,"Feedback, Review","compared against benchmarks, such as false positive rates and response times, to ensure it meets or exceeds standards set by previous approaches or industry expectations.","False Positives, Comparison","high-performance hardware setups like GPUs and CPUs, enabling the efficient training and testing of complex models.",CPU/GPU efficiency,"The model is compared to alternative anomaly detection methods to evaluate improvements in precision, recall, and processing speed",Outperformance,"higher recall, improved anomaly detection quality, and reduced processing time, indicating the model’s efficiency in practical applications.","High recall, reduction of detection time",,,"Key challenges include managing the complexity of real-time analysis, which requires fast, accurate processing, and ensuring data validity despite varied sources and formats. These challenges can impact the model's reliability and performance." 
Extracting Candidates of Microservices from Monolithic Application Code ,Identification,kamimura2018_microservice_candidates,"Kamimura, Manabu and Yano, Keisuke and Hatano, Tomomi and Matsuo, Akihiko",2018,Conference, 25th Asia-Pacific Software Engineering Conference (APSEC),APSEC,IEEE Xplore,Identification,Identifying microservices candidates through software clustering,The process uses the SArF software clustering algorithm to group related software entities and reduce manual efforts in identifying candidates,Semi-automated; human validation is still required for final decomposition decisions,microservices,"Program source code and its static relationships (method calls, data accesses)",Source Artifacts,method level,"Source code of monolithic applications, entry points defined by control annotations in code, data annotated in Java as tables","Data preprocessing includes static analysis to identify dependencies, entry points definition, and data annotation extraction for setting relationships within the code.",SArF algorithm,Classical ML,"The approach is unsupervised, as it clusters components based on predefined relationships without labeled training data","The features include dependencies (both program and data), program calls, and inheritance hierarchies between classes","Metrics include dependency reduction, candidate count, and alignment with business functions. 
Coupling is also assessed to ensure minimal dependencies",Clustering,"modularity, coupling, dependency reduction, candidate count",Assessment relies on developer feedback to verify the functional alignment and practical usability of the extracted candidates,developer validation,The method is compared against a naming-based grouping approach and the microservice version of the application to evaluate accuracy and coupling.,naming-based grouping,Case studies on both open-source applications (Spring Boot Pet Clinic) and industrial COBOL systems were used to assess the method,2 case studies ,The approach is compared to manual extraction methods and grouping based on naming conventions to highlight improvements in automation and precision.,manual extraction,"Success is defined by alignment with existing microservices and a reduction in dependencies between candidates, indicating readiness for microservices architecture.",alignment with existing microservices and a reduction in dependencies between candidates,,, GTMicro—microservice identification approach based on deep NLP transformer model for greenfield developments,Identification,bajaj2024_gtmicro_nlp_microservices,"Bajaj, D and Bharti, U and Gupta, I and Gupta, P and Yadav, A",2024,Conference,International Journal of Information Technology,IJIT,Compendex,Identification,"identification of microservice candidates by calculating semantic textual similarity (STS) between use cases, grouping them based on similarity, and clustering them into potential microservices. This process includes both data preparation(use case embedding) and clustering phases​","Machine learning is applied during the data preparation phase through the BERT model, which transforms use case descriptions into vector embeddings. 
The STS between embeddings is calculated to determine similarity, and hierarchical clustering is used to group similar use cases into microservices",The process is highly automated; from use case embeddings to clustering into microservices,microservices,"The input data consists of use case descriptions, which are textual descriptions of application functionality typically found in requirement documents at the early stages of software development",Model Artifacts,The data is at the use case level,The approach is validated on two sample applications: JPetStore (an open-source Java e-commerce application) and TFWA (a proprietary teachers' feedback application),"embedding use case descriptions into vector representations using BERT. These embeddings are then used to calculate pairwise cosine similarity scores, forming a similarity matrix. This matrix is then fed into a hierarchical clustering algorithm to identify microservice candidates​","The primary machine learning technique used is BERT, a deep learning model within the Transformer architecture, for semantic textual similarity (STS) between use cases​",Deep Learning,The approach is unsupervised; it uses pre-trained BERT embeddings without any labeled data,The features are derived from semantic embeddings of use case descriptions,"precision, recall, accuracy, and F1 score on identified microservices",Classification and Prediction,"precision, recall, accuracy, and F1 score","The approach qualitatively assesses microservice cohesion and modularity based on the logical grouping of related use cases, ensuring that each microservice represents a cohesive set of functionalities",microservice cohesion and modularity,/,/,"The evaluation was conducted on two applications (JPetStore and TFWA) with no specific mention of hardware(GPUs/CPUs). 
The BERT embeddings were computed using Python libraries, and microservices were deployed using tools like Docker and Microsoft Azure for practical validation",JPetStore and TFWA,"compared with FoSCI, CoGCN, Mono2Micro, and MEM","compared with FoSCI, CoGCN, Mono2Micro, and MEM","Success is measured by high F1 scores and precision/recall values for microservice identification, as well as improved modularity, reduced inter-service calls, and better business context purity","high F1 scores and precision/recall, improved modularity, reduced inter-service calls, and better business context purity",,,"The GTMicro approach faces challenges related to data quality and completeness of use cases, semantic similarity limitations that may miss business nuances, restricted generalizability to greenfield applications, and the computational demands of BERT embeddings for large-scale systems" Automatic Microservices Identification from a Set of Business Processes - Smart Applications and Data Analysis,Identification,daoud2020_microservice_identification,"Daoud, Mohamed and Mezouari, Asmae El and Faci, Noura and Benslimane, Djamal and Maamar, Zakaria and Fazziki, Aziz El",2020,Conference,"Smart Applications and Data Analysis: Third International Conference, @ 2020",SADASC,Compendex,Identification,"analyzing dependencies within business processes (BPs), clustering related activities, and identifying microservice candidates ","Machine learning is indirectly integrated in the clustering phase, where algorithms are used to evaluate control and data dependencies within BPs. Clustering groups activities based on shared dependencies, predicting the best microservice composition.","The degree of automation is high in clustering tasks and dependency analysis. 
However, user input might still be required for assigning specific criticality levels to data attributes.",microservices,"business processes (BPs) represented as activities, with additional information on control and data dependencies.","Model Artifacts, Source Artifacts","individual activity relationships, control dependencies (execution order), and data flow between activities.","Data is derived from documented business processes within organizational systems, emphasizing control and data dependencies to create accurate microservice clusters.","Preprocessing includes extracting dependencies and structuring data into matrices, followed by calculating criticalities for data attributes","Collaborative clustering techniques are employed, using hierarchical agglomerative clustering (HAC) as the main method.",Classical ML,"The approach is unsupervised, where clustering is based on similarity in dependencies without labeled data","Key features include control dependencies (execution order, logical operators) and data dependencies (criticality levels for data attributes).","Metrics include cohesion and loose coupling within clusters, Dunn Index, and clustering quality.",Clustering,cohesion and loose coupling,"Dendrograms visually assess the clustering, showing how control and data dependencies influence cluster composition.",clustering ,"The Dunn Index measures clustering effectiveness, balancing compactness within clusters against separation from other clusters.",Dunn index,Experiments are conducted with Java-based modules simulating the collaborative clustering algorithm over 14 BP activities and extended with additional synthetic data.,14 BPs,"The approach outperforms traditional clustering methods by maintaining dependency-specific matrices, preserving data granularity and accuracy",Outperformance,"Success is determined by higher clustering precision, better cohesion within microservices, and improved separation between distinct microservices.","higher clustering 
precision, better cohesion within microservices, and improved separation",,,"Main challenges include managing data quality degradation from dependency aggregation, handling complex data flow in real-world BPs, and ensuring validity in clustering decisions across diverse dependency types." Mono2Micro: an AI-based toolchain for evolving monolithic enterprise applications to a microservice architecture - Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering,Identification,kalia2020_mono2micro ,"Kalia, Anup K. and Xiao, Jin and Lin, Chen and Sinha, Saurabh and Rofrano, John and Vukovic, Maja and Banerjee, Debasish",2020,Conference,28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering,FSE,ACM,Identification,"Automated tasks include code instrumentation, metadata extraction, use-case trace creation, and clustering-based partitioning, reducing manual intervention in service decomposition.","Mono2Micro applies AI-based clustering for service partitioning, which evaluates business logic and data dependencies to suggest optimal service boundaries.","High degree of automation is achieved through AI-driven partitioning recommendations and automated trace analysis, although final validation by developers is encouraged.",microservices,"Inputs are Java classes, use-case execution traces, and dependency metadata collected from runtime and static analysis.","Source Artifacts, Runtime Artifacts","Data granularity includes individual classes and method calls, allowing for detailed analysis of dependencies and functional groupings at the class level.","Data is derived from instrumented application code, use-case executions, and code metadata extracted from the JEE application (DayTrader used as an example).",Preprocessing involves tagging traces with use-case labels and generating calling-context trees (CCTs) to structure data 
for partition analysis.,"Uses hierarchical clustering to group classes based on temporo-spatial relationships, generating partitions that respect business logic and data cohesion.",Classical ML,"Employs unsupervised clustering based on runtime and static analysis without pre-labeled data, allowing natural group formation based on use-case and dependency data.","Features include direct/indirect call relations, data dependencies, and business use-case labels, all of which guide the partitioning of services.","Metrics like inter-partition call percentage, business context purity, and structural modularity are used to evaluate partition quality.",Clustering,"ICP, BCP, SM","Expert developers validate partition quality, confirming alignment with expected microservice structures and providing feedback for further refinement.","Developer Validation, Refinement","Benchmarks are established through sensitivity analysis of clustering sizes, ensuring optimal performance for small, medium, and large datasets (DayTrader example).","Sensitivity Analysis, Scalability","Mono2Micro is tested on the DayTrader JEE application with varied use-case coverage, illustrating its scalability and adaptability to different partitioning needs.","DayTrader, Varied Use Case Coverage","Outperforms traditional static-only and dynamic-only approaches by combining both for a balanced, context-sensitive partitioning recommendation.","Static vs. Dynamic, Balanced Approach","Success is measured by the modularity and coherence of the partitions, as well as the reduction of inter-partition dependencies, yielding well-defined microservices.","Modularity, Cohesion",,,"Challenges include data granularity for accurate partitioning, alignment of business logic with technical dependencies, and handling complex dependency structures." 
From a Monolith to a Microservices Architecture: An Approach Based on Transactional Contexts ,Identification,nunes2019_transactional_contexts,"Nunes, Luís and Santos, Nuno and Rito Silva, António",2019,Conference,Lecture Notes in Computer Science,LNCS,Compendex,Identification,"identification and decomposition of a monolithic application into microservices (Collect Data, Generate Clusters, Visualization of microservices)",in the clustering phase of monolith-to-microservices decomposition to automate the grouping of domain entities that share transactional contexts.,"The article’s approach automates most technical tasks, like data gathering, initial clustering, and visualization, but final decision-making relies on human judgment.",,Source code of Java app,Source Artifacts,"domain entities, which are data components or objects that represent distinct elements of the application's business model (e.g., Customer, Order).","2 monolithic apps: LdoD, Blended Workflow",static analysis to generate a call graph,Hierarchical clustering algorithm.,Classical ML,"based purely on their similarity in transactional contexts, identified through static code analysis","the feature selection centers on controller-entity access patterns and transactional similarities, with irrelevant data excluded to focus on features that define optimal microservice boundaries","silhouette score, number of singleton clusters, maximum cluster size, and pairwise precision, recall, and F-score","Classification and Prediction , Clustering","NSC,MCS,SS,NRC, Precision, recall, F1score","against expert decompositions to ensure cohesive, well-defined microservice boundaries",expert decomposition,"The clustering algorithm and similarity metrics prioritize grouping domain entities frequently accessed together, thus reducing the need for cross-service calls. 
Additionally, larger clusters are avoided to prevent any single microservice from becoming a bottleneck, ensuring that each cluster can scale independently.",minimize inter-service communication,"2 case studies LdoD,Blended Workflow","LdoD,Blended Workflow","comparison with an existing tool, Structure101",Structure101,how well the resulting microservice clusters align with business logic cohesion and technical coherence.,"High Precision and Recall, Minimization of Inter-Service Communication, Balanced Cluster Sizes and Few Singletons",,,"Key concerns include balancing consistency and availability due to the CAP theorem, ensuring business logic coherence to avoid fragmentation, and the dependency on accurate data collection (e.g., limitations of java-callgraph in capturing Java 8 streams). Additionally, the approach’s reliance on the MVC pattern and specific tools (like Fénix Framework) limits its generalizability to other architectures. Lastly, the need for manual adjustments by architects introduces subjectivity, potentially leading to variations in the final microservice boundaries" Automatic Microservices Identification Across Structural Dependency - Lecture Notes in Networks and Systems,Identification,saidi2022_structural_dependency_microservices,"Saidi, Malak and Tissaoui, Anis and Benslimane, Djamal and Faiz, Sami",2022,Conference,Lecture Notes in Networks and Systems,LNNS,Compendex,Identification,The approach automates the entire microservices identification phase,The method applies K-means clustering during the identification phase to group business process activities into microservices based on structural dependencies,The process is highly automated in terms of dependency extraction and clustering,microservices,"The input is a Business Process Model (BPM), represented in BPMN (Business Process Model and Notation), capturing the sequence and structure of activities and dependencies within a system",Model Artifacts,"The data is extracted at the activity 
level within business processes,","The data appears to be created by the authors for the purpose of this study, specifically using a Bicing system model as an example",XML parsing of the BPMN file generated by the Camunda Modeler tool to identify connectors and structural dependencies between activities. This parsed data is then transformed into a structural dependency matrix to serve as input for the clustering step​,K-means clustering,Classical ML,The learning approach is unsupervised,"implicitly chosen based on structural dependencies (e.g., direct and indirect control dependencies) between activities in the business process model. These dependencies serve as features in the dependency matrix, which K-means uses for clustering​","The article evaluates the clustering approach primarily through qualitative assessment rather than formal metrics like recall or precision. It does not specify quantitative metrics for measuring clustering quality, such as silhouette score or F-score​",Classification and Prediction,Only qualitatives ,cohesion and separation of the identified microservice candidates based on their structural dependencies in the business process.,cohesion and separation,/,/,"The experimental setup appears to be basic, using a single case study (the Bicing system), and does not require specialized hardware​",A single case study,,"2 tools mentioned, Amiri et al. 
and Daoud et al",improving the logical grouping of microservices by accurately capturing structural dependencies within the business process model,capturing structural dependencies within the business process model.,,,"The article highlights challenges in capturing complex dependencies accurately, limitations in generalizability due to specific dependency formulas and BPMN reliance, and a lack of quantitative validation, raising concerns about the approach's robustness across diverse applications" Magnet: Method-Based Approach Using Graph Neural Network for Microservices Identification,Identification,trabelsi2024_magnet_graph_nn,"Trabelsi, I and Moha, N and Guéhéneuc, YG and ...",2024,Conference,International Conference on Software Architecture,ICSA,IEEE Xplore,Identification,Identification of microservices from monolith system,clustering of methods for microservices and semantic analysis,highly automated,method clusters that represent the microservices,Source code,Source Artifacts,Method level,Open source data,Static analysis with KDM and semantic analysis using word2vec,Deep Modularity Networks (DMoN) a type of GNN,Graph Based ML,Unsupervised,"method calls, method bodies, and class structures represented as vectors using Word2Vec.","Precision, recall, F-measure, SMQ (Structural Modularity Quality), CMQ (Conceptual Modularity Quality), CHM (Cohesion at Message level), and CHD (Cohesion at Domain level).","Classification and Prediction , Software Design",,Microservices quality using quality metrics,,Four systems,,Experiments conducted on four open-source systems with varied sizes. ,,Compared against ServiceCutter and MicroMiner,,"Improved modularity, functional independence, and reduced coupling.",,open-source tool on GitHub.,,generalizability across different types of monoliths. Need for fine-tuning in clustering parameters for larger and more complex systems. 
No constraints on cluster size Service Cutter: A Systematic Approach to Service Decomposition ,Identification,gysel2016_service_cutter_decomposition ,"Gysel, Michael and Kölbener, Lukas and Giersche, Wolfgang and Zimmermann, Olaf",2016,Conference, Computer Science,Computer Science,Compendex,Identification,"Service Cutter automates the identification of service boundaries by analyzing software artifacts like domain models, use cases, and applying prioritized coupling criteria to suggest optimized service cuts.","Service Cutter uses graph clustering to achieve optimal service decomposition, based on predefined coupling criteria that function similarly to data clustering in ML.","The approach provides a semi-automated decomposition process where user-defined criteria weights influence clustering, resulting in a partially automated solution adaptable to specific decomposition needs.",microservices,"Inputs consist of system specification artifacts (SSAs) like DDD entities, ERMs, and use cases, capturing technical and functional aspects necessary for service decomposition.","Domain Artifacts, Model Artifacts","Data granularity is high, with 'nanoentities' (smallest decomposable units) identified in SSAs, representing fine-grained service candidates for precise boundary definitions.","Data is derived from various software engineering artifacts, ensuring diverse perspectives on functional and structural requirements for robust service boundary recommendations.","SSAs are converted into machine-readable formats, transforming coupling criteria into scores, creating weighted graphs where each edge signifies a coupling factor between nanoentities.","Integrates two clustering algorithms (Girvan-Newman and Epidemic Label Propagation), which use edge weights in the graph to find service clusters, similar to ML clustering but focused on coupling criteria.",Classical ML,"Employs unsupervised clustering based on coupling criteria scores, similar to unsupervised ML, where 
services emerge through clustering without predefined labels.","Features include cohesiveness, compatibility, constraints, and communication needs within nanoentities, helping define tight service boundaries while reducing unnecessary coupling.","Service Cutter uses developer feedback to classify cuts as 'excellent,' 'expected,' or 'unreasonable,' with performance tests on the clustering algorithms for response time metrics.","Developer-Centric, System Behavior","Developer Feedback, Cut Quality","User feedback validates service cuts, with the tool enabling exploratory adjustments to scoring priorities, allowing architects to refine cuts based on specific goals.","User Validation, Score Adjustment","Performance is assessed by clustering speed and response time, with results indicating that Service Cutter performs efficiently, even on complex models with over 600 nanoentities.","Clustering Speed, Efficiency","Experiments on sample applications ('Trading System' and 'Cargo Tracking') assess decomposition quality, with Service Cutter yielding acceptable or good results across use cases.","Sample Apps, Use Case Validation","Service Cutter’s coupling criteria and clustering approach provide a more structured, criteria-driven method compared to traditional, heuristic-based service decomposition, ensuring better alignment with system needs.","Structured Approach, Criteria-Driven","Success is defined by high-quality service cuts that meet architects’ expectations, with modularity and cohesion maintained in clustered components, validating the Service Cutter’s approach.","Quality Cuts, High Modularity",,,"Challenges include SSA setup time, potential biases in coupling prioritization, and handling complex requirements that may not always align perfectly with automated suggestions." 
Unsupervised Detection of Microservice Trace Anomalies through Service-Level Deep Bayesian Networks,Monitoring,liu2020_anomaly_detection_microservices ,"Liu, Ping and Xu, Haowen and Ouyang, Qianyu and Jiao, Rui and Chen, Zhekang and Zhang, Shenglin and Yang, Jiahai and Mo, Linlin and Zeng, Jice and Xue, Wenman and Pei, Dan",2020,Conference,IEEE 31st International Symposium on Software Reliability Engineering (ISSRE),ISSRE,IEEE Xplore,Monitoring,Automated detection of microservice trace anomalies,Uses deep Bayesian networks with posterior flow for unsupervised anomaly detection,Fully automated with real-time trace analysis,,"Microservice trace data, response times, and invocation paths",Runtime Artifacts,Trace-level granularity including call paths and service response times,Real-time traces collected from company S's production environment,Encoding of traces into service trace vectors (STVs) based on response times and invocation paths,Unsupervised deep learning with Bayesian networks,Deep Learning,Unsupervised anomaly detection,Invocation paths and response time patterns,Precision and recall in identifying true anomalies,Classification and Prediction,"Precision, recall, anomaly detection",Comparison with rule-based detection and feedback from operators,"Operator feedback, comparison to rule-based","Benchmarked against rule-based and seven other approaches, showing higher recall and precision","higher recall, precision",Deployed on 18 online services; evaluations with millions of real-time traces and testbed data,"Real-time deployment, testbed evaluation",Outperforms rule-based and baseline models in anomaly detection accuracy,"Outperformance, rule-based, baseline models","High precision and recall in real-world deployment, root cause localization accuracy","High precision, recall, accurate localization",,, An automatic extraction approach: transition to microservices architecture from monolithic application - Proceedings of the 19th International Conference on 
Agile Software Development: Companion,Identification,eski2018_microservices_extraction,"Eski, Sinan and Buzluca, Feza",2018,Conference,19th International Conference on Agile Software Development: Companion,XP,ACM,Identification,"analyzing static and evolutionary coupling between classes in code, creating software relation graphs, and using clustering to identify service boundaries.","the approach uses graph clustering algorithms that group code elements into microservice candidates based on their coupling metrics, mimicking data-driven grouping found in ML clustering.","The process is highly automated, reducing manual intervention needed to identify microservice boundaries in large codebases by applying pre-defined thresholds and clustering methods.",,"Data includes static code elements (inheritance, method calls) and evolutionary data (commit history showing co-changes in code), representing relationships that help define boundaries between services.",Source Artifacts,"Data granularity is fine, focusing on individual classes and their interactions within a monolithic application, providing detailed dependency and coupling insights.","Data is sourced from version control systems, which provide both the latest code structure and historical changes, useful for determining static and evolutionary couplings.","Preprocessing involves parsing code into abstract syntax trees, capturing dependencies, and calculating coupling metrics, ensuring data is in a structured format for clustering.",graph clustering algorithms (Fast Community) to group classes into potential microservices.,Graph Based ML,"The approach is unsupervised; clustering relies on structural and evolutionary data without labeled training data, which resembles unsupervised learning in ML.","Features include static dependencies (inheritance, method calls) and evolutionary coupling (change frequency), which are indicators of functional boundaries.","The similarity metric MoJoSim measures the alignment between 
clustered microservices and developer-defined services, assessing clustering accuracy.",Clustering,"MoJoSim, Similarity Metric","Developers validate the approach by comparing identified microservices with known service structures, identifying cases where clustering may over-segment or combine services.","Developer Validation, Comparison","Success rates for clustering are measured by comparing against baseline monolithic structures, with similarity scores reaching up to 89% for the proposed method.","Success Rate, Similarity Score","Experiments were conducted on Java-based projects with numerous revisions, using parameters set through extensive trial and error for optimal clustering results.","Java Projects, Revision Analysis","The method outperforms static-only or evolutionary-only approaches, as the combination of both coupling types improves clustering precision and aligns with expert decomposition.","Static vs. Evolutionary, Precision","Success is defined by high similarity scores with developer-defined microservice models, indicating that the extracted services align closely with the intended modular structure.","High Similarity, Service Alignment",,,"Challenges include potential inaccuracies in authoritative microservice definitions, generalizability to other types of software projects, and handling edge cases in coupling metrics." 
TraceModel: An Automatic Anomaly Detection and Root Cause Localization Framework for Microservice Systems,Monitoring,cai2021_tracemodel_microservices ,"Cai, Yang and Han, Biao and Su, Jinshu and Wang, Xiaoyan",2021,Conference,"17th International Conference on Mobility, Sensing and Networking (MSN)",MSN,IEEE Xplore,Monitoring,Automated detection of microservice anomalies and localization of fault root causes.,Integrates a Variational Autoencoder (VAE) for anomaly detection and ModelCoder for root cause localization.,"Fully automated, from anomaly detection to fault localization in near real-time.",,Service traces and response times collected from microservices to construct a service dependency graph (SDG).,Runtime Artifacts,Node-level granularity for each microservice interaction and response time analysis.,Monitoring data from a real-world cloud-based microservice environment.,"Construction of Service Dependency Graphs (SDGs) based on traces, normalization of response times for VAE input.",Variational Autoencoder (VAE) combined with fault model-based ModelCoder.,Deep Learning,Unsupervised learning with VAE for anomaly detection and supervised model-based fault localization.,"response times, trace paths, and fault patterns for detecting anomalies and localizing faults.","Precision, recall, and localization accuracy for root cause identification.",Classification and Prediction,"Precision, recall","Compared against random walk and similar fault localization methods, showing improved accuracy.",Comparison,Benchmarking results indicate higher accuracy and speed than traditional random walk methods.,"higher recall, precision",Tested on real-world microservice data from a cloud environment with high trace volumes.,"Cloud, high volume","Compared with random walk and other localization methods, demonstrating better accuracy.",Better accuracy,High accuracy in anomaly detection and root cause localization with minimal response time.,High accuracy,,,"TraceModel encounters 
challenges in handling complex, dynamic service dependencies, which complicates accurate fault isolation. High variability in response times risks false positives, making reliable anomaly detection difficult. Additionally, scaling the model for large microservice systems requires efficient, real-time localization without overwhelming resources, and it must adapt to various fault types like network and CPU issues, each needing distinct handling." ,,,,,39,Identification,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,0,Packaging,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,12,Deployment,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,28,Monitoring,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,2,Pre-migration,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,81,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,,,80,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,