Inline Drift Detection Using Monitoring Systems and Machine Learning in Selective Laser Melting

Direct metal laser sintering, an additive manufacturing technique, is in growing demand in industries such as aerospace, biomedical, and tooling because it can manufacture complex parts with ease. Despite many technological advancements, the reliability and repeatability of the process remain an issue. Therefore, there is a demand for inline automatic fault detection and postprocessing tools that analyze the acquired in situ monitoring data to provide better quality assurance to the user. Herein, the focus is on the treatment of the data obtained using the EOSTATE optical tomography monitoring system. A balanced dataset is obtained with the help of computed tomography of the certified part (Stainless Steel CX cylindrical samples), from which a feature matrix is prepared, and the layers of the part are classified as either “Drift” or “No‐drift.” The model is trained with the feature matrix and tested on benchmark parts (Maraging Steel) and on an industrial part (knuckle, automotive part) manufactured in AlSi10Mg. The proposed semisupervised approach shows promising results for the presented case studies. Thus, the semisupervised machine learning approach, if adopted, could prove to be a cost-effective and fast way to postprocess the in situ monitoring data.

techniques in the AM process has increased in the past few years.
This article specifically focuses on laser powder bed fusion (L-PBF), which is one of the metal AM techniques, also known as direct metal laser sintering (DMLS) and selective laser melting (SLM). The monitoring system acquires data through an off-axis infrared camera installed in the EOS M290 machine, where bandpass filters are used to monitor the melt pool radiation while eliminating the reflected laser radiation. Establishing the link between the acquired data and part quality is a challenging task. However, in recent years, many researchers have used ML algorithms and computer vision (CV) techniques on large datasets for decision making. [7,8] Grasso et al. [9] detected overmelting via a statistical comparison between layerwise emitted light intensity profiles. Some studies have also linked the statistical descriptors of spatters (spatter count and size) and vapor plume with the processing parameters and their influence on flaw formation during printing. [10,11] Khanzadeh et al. [12] proposed a supervised learning-based model to predict porosity, in which the pattern from the individual melt pool images is extracted to predict the probability of class labels. Recently, Okaro et al. proposed another perspective on the ML approach for data treatment and predicted the quality of the parts using their mechanical properties as a descriptor. Key features were extracted from the photodiode signals, and a semisupervised classification algorithm called "Gaussian Mixture Model-Expectation-Maximization (GMM-EM)" was applied to classify the samples as "faulty" or "acceptable" based on the ultimate tensile strength of the tensile bars. This approach showed the possibility of automatic certification of L-PBF parts based on their mechanical properties. However, the method of Okaro et al. does not offer a way to detect the defects in the parts, which are the cause of the inferior mechanical properties. [13] In contrast, Scime and Beuth [14] studied flaw detection using powder bed layer images. They correlated the effect of the powder bed layer on the final part quality using a CV algorithm. Therefore, ML approaches can be promising steps toward certifying the quality of the final parts. They can be used as an additional verification step alongside currently used techniques such as computed tomography.
In this work, a commercially available EOSTATE exposure optical tomography (OT) module installed on an EOS M290 is used for in situ monitoring. The acquired in situ data are postprocessed to check for drift layers in an automotive part and a benchmark part using an ML-based classification approach. A "drift layer" is classified based on the localized hotspots generated by localized variation in melt pool signatures in a particular layer of the part, which significantly affects the final part quality. Any ML problem, in theory, can be broadly classified into two categories: 1) supervised and 2) unsupervised. In a supervised ML approach, a set of labeled data is used to train the classifier, and the trained classifier is used to predict the outcome of unlabeled data. In contrast, an unsupervised approach is used to ascertain labels through correlations/patterns in unlabeled data and is explicitly dependent on the distinction in the dataset. Unsupervised learning is also helpful for labeling unknown data and can be used for organizing the data in clusters, anomaly detection, or association. A more detailed explanation of these approaches can be found in the literature. [15,16] However, there exists an ML approach that utilizes both the supervised and unsupervised ML approaches in a single pipeline, called the "semisupervised learning approach." The semisupervised learning approach exists because, in many real-world applications, one cannot directly apply either a supervised or an unsupervised approach, as fully labeled datasets are scarce. [17] The main principle behind a semisupervised ML approach is that an unsupervised learning approach is first applied to the unlabeled dataset to obtain initial labels and learning parameters such as a suitable distance metric. The newly labeled data are then used to train a supervised learning-based classifier; a similar approach can be found in the literature. [18][19][20] This hybrid pipelined approach is beneficial for applications such as AM, where getting labeled data is a challenge.
In this study, we propose a semisupervised learning-based approach to postprocess and detect the drift in the acquired in situ OT data. First, an unsupervised learning algorithm (K-means clustering) is used to expand the available limited labeled dataset and to find the best-suited distance metric to generate a training dataset. Second, a supervised learning-based classification algorithm (k-nearest neighbor) is trained with the labeled dataset (obtained from K-means clustering) and tested with real case studies. In this study, the in situ monitoring dataset is treated as a binary classification problem, and the labeled data have the labels "no-drift" and "drift," which signify the possibility of drift in a particular layer. The important contributions of this article are as follows: 1) A decision on the training features from the dataset based on the histogram mapping of the layerwise OT data, validated with CT data. 2) Use of K-means clustering for labeling a limited labeled dataset based on the selected data features. Cross validation with CT data is carried out to decide the suitable distance metric and other classification parameters. 3) Use of the labeled data to train/test a k-nearest neighbor (k-NN) classifier and judge its accuracy. 4) Application of the trained k-NN classifier to case studies of complex geometry parts and cross validation of the k-NN results using the EOSTATE exposure OT analysis tool.

Hot Spot
Local areas with high light intensity compared with the rest of the layer. These hotspot areas indicate the highest probability of defect occurrence in the final part.

Drift
The drift layer is an indication of the presence of the hotspots in a particular layer. If the local hotspots are present in the layer, the whole layer is termed as the "Drift layer."

No-Drift
In contrast to the "Drift layer," a layer in which no hotspots exist is termed a "No-drift layer."

Feature
In ML, a feature is a measurable property that can be quantified and recorded. Features are extracted from the input data in a way that simplifies the classification task. In our application, the mean and median are the features extracted from each layer's image.

Feature Matrix
An n × m dimensional matrix, where n is the number of data samples and m is the number of features. In other words, a feature matrix contains n feature row vectors, each of dimension 1 × m.

Balanced Dataset
A dataset having an equal number of layers for each label ("drift" and "no-drift").

Background and Methods
This section presents the theoretical background of the ML algorithms used and of the in situ monitoring systems used for monitoring the process. In later sections, the link with our work will be established.

In Situ Monitoring Techniques
Despite the many technological advantages of AM over conventional manufacturing techniques such as casting and machining, the process capability still represents a major limitation to its industrial breakthrough. The certification standards for AM parts are still under development, and process stability is a major concern for many risk-averse industries. [21] Therefore, in recent years, both researchers and machine manufacturers have shifted their focus toward developing more reliable monitoring systems to monitor the process stability and detect possible drifts during the process. This has led to the development of different monitoring systems such as layer control systems, which are mainly dedicated to powder bed spreading, laser power control, melt pool monitoring, acoustic methods, and ultrasonic methods. [22,23] In this study, we focus on melt pool monitoring systems, which mainly rely on the detection of electromagnetic radiation from the melt pool. In situ monitoring systems can be categorized into two classes, principally based on their layout in the machine: coaxial/on-axis systems and off-axis systems.
The coaxial monitoring systems monitor the optical path during printing using high-precision sensors such as photodiodes, high-speed cameras, and pyrometers. [23] The spectral sensitivity of these systems varies from manufacturer to manufacturer, but most of the systems have a spectral detection range in the near infrared (NIR) region. A bandpass filter is used to eliminate the detection of radiation from the visible spectrum and laser backscattering from the build chamber. Pavlov et al. [24] used two on-axis InGaAs photodiodes with transmission spectra of 1.26 and 1.4 μm, respectively, to monitor the temperature of the laser spot while printing. The relation between the photodiode signal response and process parameters such as layer thickness, hatch distance, and scan strategies was established. Berumen et al. [25] studied the correlation of heat accumulation with varying layer thickness, sharp contours, and overhang structures using an on-axis photodiode with a spectral range of 780 nm up to 900 nm.
In contrast to on-axis systems, the off-axis systems are spatially stationary relative to the laser and do not follow the optical path. For example, Kleszczynski et al. [26,27] studied the variation in volumetric energy during printing using an off-axis high-resolution CCD camera. The sensors are placed outside the optical path and have a stationary field of view, i.e., the whole build plate. Therefore, a geometric correction factor is always applied to the acquired images. [28][29][30] In our study, we postprocess the data from the off-axis sCMOS camera.
It is important to note that the sensitivity of these systems depends on various factors, such as process parameters, material, and other unwanted factors. For a more detailed review of various in situ monitoring systems, interested readers can refer to the study by Grasso and Colosimo. [23] Some of the commercially available melt pool monitoring systems are listed in Table 1.

Machine Learning
The field of ML is concerned with the question of how to construct computer programs that automatically improve with experience. In this section, we will define through mathematical modeling the ML algorithms utilized in this work.

K-Means Clustering
K-means clustering, or Lloyd's algorithm, [31] is one of the most widely studied unsupervised ML algorithms in the literature. It is an iterative partitional clustering algorithm that divides n data/feature samples into K disjoint clusters that minimize the squared error criterion, and each cluster is characterized by a centroid. It is important to note that the initial centroid seeds play an important role in the K-means clustering algorithm, as different initial centroids can provide different results. [32] We have utilized the K-means++ algorithm [33] to choose initial centroid seeds, which achieves faster convergence to a lower sum-of-squares point-to-cluster-centroid distance than Lloyd's algorithm but offers no guarantees of optimality. The parameter K (i.e., the number of clusters) is selected a priori, which in our case is 2. The iterative steps of the K-means++ algorithm are shown in the flow graph in Figure 1.
Let us define an $n \times m$ dimensional dataset/feature matrix $D = [x_1, x_2, \ldots, x_n]^T$ with $n$ samples, and let the set of $K$ centroids be $C = [c_1, c_2, \ldots, c_K]^T$, where each $x$ and $c$ is a point in an $m$-dimensional space. The steps used to implement the K-means++ algorithm are as follows:
1) Select an observation $x_a$ uniformly at random from the dataset/feature matrix $D$. The chosen observation is the first centroid, denoted as $c_1$.
2) Denote the distance from the $i$th data point $x_i$ to the $j$th centroid $c_j$ as $d(x_i, c_j)$, and compute the distance from each observation to $c_1$, i.e., $d(x_i, c_1)$.
3) Choose the next centroid $c_2$ by selecting the data point $x_t$ with probability $\frac{d(x_t, c_1)^2}{\sum_{j=1}^{n} d(x_j, c_1)^2}$. This step implies that if the data point $x_t$ is near the centroid $c_1$, the likelihood of $x_t$ becoming centroid $c_2$ is negligible.
4) Repeat step 3 until all $K$ initial centroids are assigned, i.e., $c_3, \ldots, c_K$.
5) For each $l \in \{1, 2, \ldots, K\}$, set the cluster $\mathcal{C}_l$ to be the set of data/feature points in $D$ that are closer to $c_l$ than they are to $c_p$, $\forall p \neq l$.
6) For each $l \in \{1, 2, \ldots, K\}$, update the centroid $c_l$ to be the center of mass of all data points in $\mathcal{C}_l$, i.e., $c_l = \frac{1}{|\mathcal{C}_l|} \sum_{x \in \mathcal{C}_l} x$. This process is equivalent to calculating a mean, hence the name K-means clustering.
7) Iteratively repeat steps 5 and 6 until the cluster assignments do not change.
In K-means clustering, the distance metric d plays an important role, and various distance metrics have been suggested in the literature, e.g., Euclidean, correlation, cosine, and city block distance. In our work, we have experimented with various distance metrics to find the most suitable one for our application.
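The seeding and iteration steps listed above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the implementation used in this work: the function names (`kmeans_pp_seeds`, `kmeans`), the use of Euclidean distance as the default, and the handling of empty clusters are assumptions made for the example. The seeding weights each point by its squared distance to the nearest centroid chosen so far, which for the second centroid reduces to the probability given in step 3.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def kmeans_pp_seeds(data, k, dist, rng):
    """Steps 1-4: choose k initial centroids with squared-distance weighting."""
    centroids = [rng.choice(data)]  # step 1: first centroid, uniform at random
    while len(centroids) < k:
        # Squared distance of each point to its nearest already chosen centroid.
        weights = [min(dist(x, c) for c in centroids) ** 2 for x in data]
        if sum(weights) == 0:  # all points coincide with a centroid
            centroids.append(rng.choice(data))
        else:
            centroids.append(rng.choices(data, weights=weights, k=1)[0])
    return centroids


def mean_point(points):
    """Center of mass of a non-empty list of points (step 6)."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def kmeans(data, k, dist=euclidean, max_iter=1000, seed=0):
    """Return (cluster index per point, list of centroids)."""
    rng = random.Random(seed)
    centroids = kmeans_pp_seeds(data, k, dist, rng)
    labels = None
    for _ in range(max_iter):
        # Step 5: assign every point to its closest centroid.
        new_labels = [min(range(k), key=lambda j: dist(x, centroids[j]))
                      for x in data]
        if new_labels == labels:  # step 7: assignments no longer change
            break
        labels = new_labels
        # Step 6: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid; an assumption here).
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                centroids[j] = mean_point(members)
    return labels, centroids
```

Calling `kmeans(feature_rows, k=2)` on a list of (mean, median) tuples returns one cluster index per layer; which index corresponds to "drift" still has to be decided against labeled data.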

k-Nearest Neighbor Classifier
The k-nearest neighbor (k-NN) algorithm is a supervised learning classification algorithm and one of the most studied classifiers in the literature; despite its simple formulation, its performance is on par with that of the most complex classifiers available. [34] The k-NN learning algorithm requires labeled training data and a predefined value of the number-of-nearest-neighbors parameter $k$, which the classifier uses to find the $k$ nearest neighbors of a query datum (unclassified data) based on a distance metric. The $k$ nearest neighbors can have different classes, and the algorithm predicts the class of the query datum as the majority class of its nearest neighbors. Let $T = \{(x_{i'}, L_{i'})\}$, $\forall i' = 1, 2, \ldots, N$, denote the training set with $N$ samples, where $x_{i'} \in \mathbb{R}^m$ is an $m$-dimensional training feature vector with a known class label $L_{i'}$.
Let a query datum $x'_{i'}$ be an $m$-dimensional vector of data or features, to which a label $L'_{i'}$ will be assigned. Now let $T' = \{(x^{k}_{i'}, L^{k}_{i'})\}$, with the superscript running over the $k$ nearest neighbors, denote the set of k-nearest neighbors of $x'_{i'}$ based on a distance metric. Based on the majority class of $T'$, $x'_{i'}$ is assigned a label, i.e.,

$$L'_{i'} = \arg\max_{L} \sum_{(x^{k}_{i'}, L^{k}_{i'}) \in T'} \delta_{L, L^{k}_{i'}}$$

where $\delta_{L, L^{k}_{i'}}$ is a Kronecker delta function.

[Figure 1: Flow graph of the K-means++ algorithm: choose an initial centroid uniformly at random from the dataset/feature matrix D; compute the distances to c_1 and select the next centroid c_2 based on the probability given in step 3; repeat until a total of K centers are chosen; assign each point to the cluster of its closest centroid.]

www.advancedsciencenews.com www.aem-journal.com
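The majority vote of the k-NN classifier can be illustrated with a short standard-library Python sketch. It is a minimal example under stated assumptions, not the implementation used in this work: the function name `knn_predict` and the default Euclidean distance are chosen for illustration. Counting the labels of the nearest neighbors is the sum of Kronecker deltas per class, and taking the most frequent label is the arg max over classes.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def knn_predict(train, query, k, dist=euclidean):
    """Predict the label of `query` by majority vote of its k nearest neighbors.

    train : list of (feature_vector, label) pairs
    query : feature vector to classify
    """
    # The k training samples closest to the query form the neighbor set.
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    # Counter accumulates one vote per neighbor label.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```

With two classes, an odd value of k avoids tied votes; in this sketch a tie would be resolved in favor of the label met first among the sorted neighbors.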

Distance Metrics
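As both the supervised (k-NN) and the unsupervised (K-means) learning algorithms rely on a distance metric, we define the metrics mathematically in this section. For any two points $x'_{i'}$ and $x^{k}_{i'}$ in an $m$-dimensional space, with components indexed by $q = 1, \ldots, m$, the Euclidean distance can be defined in its standard form as

$$d_{\mathrm{E}}(x'_{i'}, x^{k}_{i'}) = \sqrt{\sum_{q=1}^{m} \left(x'_{i',q} - x^{k}_{i',q}\right)^2}$$

the City Block distance as

$$d_{\mathrm{CB}}(x'_{i'}, x^{k}_{i'}) = \sum_{q=1}^{m} \left|x'_{i',q} - x^{k}_{i',q}\right|$$

and the Correlation distance as

$$d_{\mathrm{C}}(x'_{i'}, x^{k}_{i'}) = 1 - \frac{\sum_{q=1}^{m} \left(x'_{i',q} - \bar{x}'_{i'}\right)\left(x^{k}_{i',q} - \bar{x}^{k}_{i'}\right)}{\sqrt{\sum_{q=1}^{m} \left(x'_{i',q} - \bar{x}'_{i'}\right)^2}\,\sqrt{\sum_{q=1}^{m} \left(x^{k}_{i',q} - \bar{x}^{k}_{i'}\right)^2}}$$

where $\bar{x}'_{i'}$ and $\bar{x}^{k}_{i'}$ denote the means of the $m$ components of the respective vectors.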
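As an illustration, the three distance metrics named in this section can be written as small Python functions using their standard textbook definitions. The function names are chosen for this sketch, and the definitions are assumed to match the ones intended here; in particular, the correlation distance is taken as one minus the Pearson correlation between the components of the two vectors.

```python
import math


def euclidean(a, b):
    """Square root of the summed squared component differences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def city_block(a, b):
    """Sum of the absolute component differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


def correlation(a, b):
    """One minus the Pearson correlation of the components of a and b."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    ca = [p - mean_a for p in a]  # centered components of a
    cb = [q - mean_b for q in b]  # centered components of b
    num = sum(p * q for p, q in zip(ca, cb))
    den = math.sqrt(sum(p * p for p in ca)) * math.sqrt(sum(q * q for q in cb))
    if den == 0:
        raise ValueError("correlation distance is undefined for a constant vector")
    return 1.0 - num / den
```

One property of this definition is worth keeping in mind: for vectors with only two components, the centered vectors are always parallel or antiparallel, so the function returns a value of either 0 or 2 (up to rounding).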

In Situ Monitoring Module
For our study, we used the EOSTATE melt pool monitoring module installed on an EOS M290, for which the schematic diagram is shown in Figure 2. The OT system comprised an off-axis scientific complementary metal-oxide-semiconductor (sCMOS) camera with a spectral detection range in the NIR. Usually, the radiation from the build chamber consisted of three components, i.e., backscattered laser (1064 nm), plasma radiation due to evaporation and ionization of gases (400-600 nm), and thermal emissions, which ranged from the visible (380-780 nm) to the near infrared (≈1400 nm) spectrum. [35] As the part quality mainly depended on the thermal emissions from the melt pool, radiation at other wavelengths was eliminated by installing a bandpass filter in front of the camera (the type of bandpass filter cannot be revealed due to the machine provider's confidentiality clause).
The OT system had a camera resolution of 2560 × 2160 pixels, which allowed a spatial resolution of 125 μm per pixel to be achieved across the entire build platform. The OT system had a frame rate of ten frames per second. At the end of every scanned layer, all the images were superimposed, and a holistic picture of the whole layer was saved as a single 16-bit tagged image file format image. The final image of a particular layer represented the process map that can be correlated with the light intensity emitted by the process. Due to the off-center position of the camera, a geometric correction was applied to the images via software.

Materials and Methods
In our study, a total of 18 Stainless Steel CX cylindrical samples with a diameter of 10 mm and a height of 15 mm were printed on an EOS M290. The input volumetric energy density was varied over a range to prepare a dataset of certified samples to be used for training and verification of the model. The deliberately varied process parameters induced drift during the process, which resulted in varied porosity levels in the parts. The optimized printing process parameters are shown in Table 2.
The laser power and laser scan speed were varied by ±30% from the optimized process parameters shown in Table 2. Also, the layer thickness was varied to 30, 60, and 90 μm. It is worth noting that the possibility of having drift due to bad powder layer spreading was not considered in this study. To prepare a labeled dataset for training, computed tomography was conducted on the printed samples to determine pores, using an X-ray inspection system with a 180 kV microfocus tube and an area detector with a voxel size of 19 or 22.5 μm. The smallest evaluated pore had a volume of 5 voxels. For testing the supervised learning classifier, two case studies were chosen: an automotive part called "Knuckle" and a benchmark part called "Overhang structure" (Figure 3). The "knuckle" was printed with AlSi10Mg, and the benchmark part was printed with Maraging steel. The "knuckle" and "overhang part" were chosen as case studies because the location of the overheating failure in these parts was detectable via visual inspection. Thus, we chose not to carry out a CT scan for this first study. The analysis tool from EOS could indeed detect cold spots and hotspots, which were treated as the areas with the highest probability of a defect in the final part. The EOSTATE exposure OT analysis tool was then used for the cross validation of the results obtained from the supervised learning approach for the presented case studies.

Image Analysis
Clijsters et al. [37] used a coaxial setup (CMOS camera and photodiode in the NIR range) to capture melt pool signatures such as melt pool area and intensity to monitor the quality of SLM parts. They prepared a dataset based on the melt pool area and intensity for different scan vectors such as fill scan and contour scan. It was shown that the heat transfer depends mainly on the environment of the melt pool: the heat flux was higher when the melt pool was surrounded by printed material than when it was surrounded by powder. For each of these classes (fill scan and contour scan), a confidence interval was defined, and errors were detected based on the defined confidence interval. The proposed method was studied for small cube structures. However, in our study, we noticed that such a confidence window based on mean and standard deviation could not be defined for complex and real-case parts. To show this, the mean and variance of each layer for the complex part called "Knuckle" (refer to Figure 3a for the part geometry) are shown in Figure 4. It can be observed that the mean and standard deviation depended on the printed area of each layer and changed with the shape of the complex part. Therefore, a global threshold limit based on the mean and standard deviation could not be applied to complex geometries, and thus ML is a suitable approach for postprocessing of in situ data. For training our ML classifier, it was then required to select suitable features from the given dataset. Preprocessing of the OT images was a necessary step before feature extraction and further ML processing. As the OT image captured the whole build plate, the specific part region was cropped. The background was then removed from the cropped part images based on a suitably selected intensity threshold.
Through a preliminary statistical analysis carried out on individual image layers, it was observed that the mean and the median of the corresponding images increased significantly due to the presence of hotspots in the thermal images when compared with the cases of absence (Figure 5).

[Figure 3: a) "Knuckle" automotive part [36]; the length of this part is ≈500 mm, and only a section was 3D printed; b) benchmark part used to determine the critical overhanging angle in AM processes.]
The presence of localized hotspots in the layer was due to local variation in the melt pool shape and size. The localized variation in melt pool signatures could be influenced by localized variation in powder bed spreading or process parameters. The more hotspots there are in the images, the higher the probability of drift. For example, Zenzinger et al. [35] demonstrated the link between the hotspots in OT images and the defects in the μCT scan of the layer. Recently, Mohr et al. [38] also studied OT images to detect defects in the final part and compared the OT images with the μCT images. It was also verified that the hotspots in the OT images are linked to defects in the final part. For example, Figure 5a represents a layer image without any hotspots, which was verified with the CT image as well (Figure 5b), and another image with hotspots (marked by red circles) is shown in Figure 5c. The corresponding CT image with the porosity (marked by red circles) is also shown in Figure 5d. On comparing the histograms of the images in both scenarios, the histogram of the image that represented a probable drift layer (a layer with hotspots is termed a drift layer) showed a right shift when compared with the image of the layer with no possible drift (layer without hotspots). The right shift in the histogram was due to the presence of hotspots, which led to a higher mean in the layer compared with the no-drift layer (Figure 5c).
As the relative values of the mean and median of the image pixels could describe the degree of right/left shift of the histogram, we decided on the mean and median as the features of choice for training and testing the classifier. The flowchart of the feature extraction procedure is shown in Figure 6.
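The feature extraction described above (background removal by an intensity threshold, followed by the mean and median of the remaining pixels) can be sketched as follows. This is a minimal illustration under assumptions: the layer image is represented as a plain nested list of intensities for the already cropped part region, the function names are hypothetical, and the threshold value is a placeholder because the value used in this work is not stated.

```python
from statistics import mean, median


def layer_features(image, threshold):
    """Return (mean, median) of the part pixels of one cropped layer image.

    image     : 2-D list of pixel intensities (cropped part region)
    threshold : background cut-off; pixels at or below it are discarded
                (hypothetical value, not taken from the paper)
    """
    pixels = [p for row in image for p in row if p > threshold]
    if not pixels:
        raise ValueError("no foreground pixels above the threshold")
    return mean(pixels), median(pixels)


def feature_matrix(images, threshold):
    """Stack one (mean, median) row per layer into an n x 2 feature matrix."""
    return [layer_features(img, threshold) for img in images]
```

In this toy representation, a single very bright pixel raises the mean of a layer much more than its median, which mirrors the right shift of the histogram discussed above.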

Initial Data Labeling and Suitable Distance Metric Selection Through K-Means Clustering Algorithm
Another essential aspect of ML algorithms such as K-means and k-NN is the distance metric. As discussed previously in Section 3.2.3, many distance metrics have been proposed in the literature, but only a few suit a particular application. In this section, we discuss the initial labeling of sample layers as "drift" or "no-drift" and choose the suitable distance metric for our application based on the experimentation done on the certified dataset.
Due to the complex nature of the DMLS process, obtaining a labeled dataset in an automated fashion is a challenging task. Therefore, initially, a labeled dataset of 40 layers (20 layers of each label, i.e., "drift" and "no-drift") from a set of 14 Stainless Steel CX cylindrical samples was prepared manually. The categorization of specific layers as "drift" and "no-drift" was based on the visual comparison of a particular OT image with the corresponding CT image (Figure 5). However, the resolution difference between OT and CT images hindered a direct visual comparison of all the images. Thus, it was not possible to label and prepare a large dataset manually. Therefore, the unsupervised K-means clustering algorithm was used to automate the task of labeling. As seen in the comparison with CT data, it could be concluded that a higher number of OT hotspots results in a higher probability of having a real defect in that layer. Based on this hypothesis, a dataset of another 200 unlabeled layers was prepared such that it comprised an approximately equal number of "drift" and "no-drift" layers. Nevertheless, it should be noted that not every image with an OT indication will lead to a real defect in the printed layer: due to the repetitive nature of printing in the DMLS process, an existing defect in the previously printed layer may be resolved in the next printing scan. Next, the 40 labeled layers and 200 unlabeled layers were mixed to perform a cluster analysis. We extracted m = 2 features (mean and median) from each layer image, and a total of n = 240 feature vectors were formed, resulting in a feature matrix D of dimension 240 × 2. The clustering parameter was set to K = 2 (as the data were divided into two clusters) and the maximum number of iterations to 1000. The clustering process was repeated five times using new initial cluster centroid positions/seeds.
The final clustering solution was the one having the lowest sum of point-to-centroid distances (see Section 3.2.1). The accuracy of the K-means cluster assignments was cross validated against the initial 40 labeled samples using multiple distance metrics, and the Correlation distance metric showed the maximum validation accuracy. The high validation accuracy also indicated that the selection of mean and median as features was sufficient to differentiate between "drift" and "no-drift" conditions. The cluster assignment using different distance metrics on the expanded dataset is shown in Figure 7. The validation accuracies of the different distance metrics are shown in Table 3. Based on the highest percentage accuracy achieved through the K-means method, we chose the correlation distance as the suitable distance metric for our application and finalized the resulting labels for subsequently training a supervised learning-based classifier, i.e., the k-NN classifier.
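The cross validation of cluster assignments against the manually labeled layers can be sketched as below. The sketch assumes that a clustering run has already produced one cluster index per layer for each candidate distance metric; the function names and the input layout are hypothetical. Because cluster indices carry no meaning by themselves, each possible mapping of indices to the two labels is tried and the best agreement is reported.

```python
from itertools import permutations


def validation_accuracy(cluster_ids, known_labels, classes=("drift", "no-drift")):
    """Best agreement between cluster indices and the manually labeled layers.

    cluster_ids  : list of cluster indices (0 .. K-1), one per layer
    known_labels : dict {layer_index: label} for the manually labeled subset
    """
    best = 0.0
    for mapping in permutations(classes):  # try every index-to-label mapping
        hits = sum(mapping[cluster_ids[i]] == label
                   for i, label in known_labels.items())
        best = max(best, hits / len(known_labels))
    return best


def pick_metric(assignments_by_metric, known_labels):
    """Return (best metric name, accuracy per metric)."""
    scores = {name: validation_accuracy(ids, known_labels)
              for name, ids in assignments_by_metric.items()}
    return max(scores, key=scores.get), scores
```

For the dataset described here, `known_labels` would hold the 40 manually labeled layers and each entry of `assignments_by_metric` would hold 240 cluster indices.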

Number of Nearest Neighbor Parameter Selection for k-NN Classifier
After identifying the suitable distance metric, the next important step was to prepare a balanced training dataset (a dataset with an equal number of data points for each label) to train a k-NN classifier (a supervised ML algorithm). It was imperative to prepare a balanced dataset to avoid bias toward one class. Therefore, 100 data points for each label (200 in total) were chosen randomly from the dataset of 240 that resulted from the K-means clustering. Although we prepared a balanced dataset for training, one can apply methods such as "Class confidence weighting" if only an imbalanced dataset is available. [39,40] The dataset was divided randomly in a 70:30 ratio into a training and a validation dataset. To find the best value for the number of nearest neighbors, i.e., k, the accuracy of the trained k-NN classifier with varying k was tested on the validation set, as shown in Figure 8. The accuracy versus k plot suggested that a value of k ∈ (9, 23) could be chosen for the maximum classification accuracy. As choosing a large value of k increases the computational complexity of the k-NN classifier, we chose the minimum indicative value k = 9. The overall working pipeline of our semisupervised learning model is shown in Figure 9.
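The selection procedure described above (balanced sampling, a random 70:30 split, and a sweep over k on the validation set) can be sketched in standard-library Python. This is an illustrative sketch only: the function names are hypothetical, and Euclidean distance via `math.dist` is used as a stand-in for brevity, whereas this work selected the correlation distance.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote of the k nearest training samples (Euclidean stand-in)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def balanced_split(samples, per_label, train_fraction, rng):
    """Draw `per_label` samples of each class, then split train/validation."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append((features, label))
    chosen = []
    for items in by_label.values():
        chosen.extend(rng.sample(items, per_label))  # equal count per class
    rng.shuffle(chosen)
    cut = int(round(train_fraction * len(chosen)))
    return chosen[:cut], chosen[cut:]


def accuracy_by_k(train, val, k_values):
    """Validation accuracy of the k-NN vote for each candidate k."""
    return {k: sum(knn_predict(train, x, k) == y for x, y in val) / len(val)
            for k in k_values}


def smallest_best_k(scores):
    """Smallest k that reaches the maximum validation accuracy."""
    best = max(scores.values())
    return min(k for k, acc in scores.items() if acc == best)
```

Choosing the smallest k among the best-performing values follows the reasoning given above, since a larger k increases the cost of each prediction.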

Testing on Certified Data
In the Experimental Section, the stainless steel CX cylindrical samples were printed with different process parameters such as layer thickness, power, scanning speed, and hatch distance to generate a certified dataset for training the algorithm. The certification of the samples was done through a CT scan. A total of four certified cylindrical samples with CT porosities of 0.0033%, 0.0054%, 4.3967%, and 0.1436% were chosen for testing the trained k-NN classifier. It should be noted that the data of the four chosen cylinders were not used to train the classifier. The semisupervised model predicted 20.82%, 1.9543%, 1.1363%, and 0.4862% layers as "drift" for the sample with CT scan porosities of 4.3967%, 0.1436%, 0.0054%, and 0.0033%, respectively. The k-NN classifier-based test results indicate that the sample with CT scan porosity of 4.3967% has the highest number of layers with hotspots, which is due to nonoptimized processing parameters. As the dataset size is limited, further comments on an analogy between porosity and drift detected (particularly at lower porosity levels) are out of the scope of this work.

Case Study: Benchmark Part
We now apply our trained k-NN classifier to data obtained from real complex parts; for that, we choose benchmark parts that are often used for material development in the industry. Among the different benchmark parts, the "Overhang" part shown in Figure 10a is one of the critical ones; it aims to find the critical overhang angle for a particular material that can be printed without support structures. Above a critical overhang angle, support structures are used as an anchor to prevent the failure of the part. Therefore, it is vital to know the overhang angle for the optimization of process parameters for new material development in the DMLS process. The certainty that the part fails at a particular overhang angle makes it a suitable candidate for our case study, as the location of the failure is known and can be used as a ground truth label. The OT images of the part were exported and preprocessed as described in the previous sections, and the feature matrix was prepared. As mentioned in the previous section, the k-NN classifier is already trained with labeled data of the certified part, and the predicted labels are shown in Figure 10b. The location of the layers labeled as "drift" is shown in Figure 10c; the poor thermal conductivity due to a large overhang angle led to the overheating drift in the layers numbered from 1535 to 1564. This hypothesis is validated by the EOSTATE exposure analysis tool, which shows the presence of hotspots in the "drift" labeled layers. Thus, the labels predicted by the ML model correlate with the results from the analysis tool as well.

Case Study: Industrial Case, Automotive Steering "Knuckle"
For another case study, we choose an automotive part called "Knuckle." Figure 11a shows the scatter plot of the predicted labels; the last few layers of the part are classified as "drift" layers. This classification agrees with the visual inspection of the part quality. As shown in Figure 11b, the topmost layers of the part, from layer number 7682 to 7764, failed due to overheating caused by the lack of heat dissipation. This part is a good example of the need for design optimization in AM, as poor support structures lead to part failure.
It was also verified with the EOSTATE exposure analysis tool that the overheating leads to hotspots in the captured images. A second knuckle with an optimized design was printed with optimized support structures for better heat dissipation. Figure 12a shows the predicted labels: only eight layers are classified as "drift" layers, owing to the presence of hotspots in the layers numbered from 6150 to 6158. This hypothesis is cross validated with the EOSTATE exposure analysis tool, as shown in Figure 12c, which shows the location of the hotspot in layer number 6150. It should be noted that the hotspot remains at the same location for the next eight layers (from layer number 6150 to 6158), which is also predicted by the proposed k-NN approach. The automotive part called "Knuckle" shown in Figure 12b is printed within the framework of the European Union project titled "MAESTRO." [36] The initial tests of the proposed semisupervised ML model are carried out on three different parts, which were printed with different process parameters and different materials (Benchmark part-Maraging Steel, Knuckle-AlSi10Mg). These initial tests indicate that the proposed algorithm and feature extraction are independent of the process parameters and material of the printed part.
It should also be noted that OT is not sensitive to all kinds of defects that can occur during the printing process. The DMLS process is very complex, and hundreds of factors can affect the quality of the final part. For example, Galy et al. listed the different parameters that can influence the quality of Al alloy parts during printing. [41] It is therefore challenging to monitor all possible factors with OT, and certification for quality assurance still requires techniques such as a CT scan. However, a CT scan is costly, and part size limitations make it difficult to use for the certification of every part. The proposed ML-based model can therefore be used to select good parts for further, expensive postprocessing techniques.

Conclusions
In situ monitoring of the DMLS process can significantly improve the reliability of the whole process for quality assurance of the product. However, a few issues with the monitoring systems need to be solved to enable fast and easy anomaly detection. First, processing the enormous amount of unlabeled data obtained from these monitoring systems is a huge challenge. Second, it is very laborious to detect drift in the final part from the in situ monitoring data. Therefore, the use of ML algorithms for the treatment of in situ monitoring data has merit. This article continues the ongoing research in the field of data treatment in AM.
The key contributions of this work are summarized as follows: 1) An unsupervised K-means clustering algorithm was used to label the unlabeled data with the help of a small set of labeled data, which also helped in choosing the most suitable distance metric. The accuracy of the algorithm was verified using the computer tomography-based certified data. 2) The labeled dataset was used to train the k-NN classifier. The optimum number of nearest neighbors k was chosen on the basis of the accuracy versus k plot. 3) The semisupervised approach successfully classifies the layers as either "drift" or "no-drift" for the presented case studies, and the results were cross validated with the EOSTATE exposure OT analysis tool.
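The workflow summarized in these contributions can be illustrated with a small, self-contained sketch: a few labeled points seed a two-cluster K-means step that assigns pseudo-labels to unlabeled points, and a k-NN classifier trained on those pseudo-labels is then scored for several values of k. This is not the implementation used in this work. The Euclidean distance, the two-dimensional synthetic features, the seeding strategy, and all function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mean_point(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def seeded_kmeans(unlabeled, seeds, n_iter=20):
    """K-means with one centroid per class, initialized from labeled seeds.

    Returns a pseudo-label for every point in `unlabeled`.
    """
    classes = sorted(seeds)
    centroids = {c: mean_point(seeds[c]) for c in classes}
    assigned = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid (Euclidean distance, an assumption).
        assigned = [min(classes, key=lambda c: math.dist(x, centroids[c]))
                    for x in unlabeled]
        # Update step: move each centroid to the mean of its members.
        for c in classes:
            members = [x for x, a in zip(unlabeled, assigned) if a == c]
            if members:
                centroids[c] = mean_point(members)
    return assigned


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points nearest to `query`."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k):
    """Fraction of test points whose k-NN prediction matches the true label."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_x)


# Synthetic, well-separated 2-D features (hypothetical, not OT data).
rng = random.Random(0)


def blob(center, n):
    return [(rng.gauss(center[0], 0.5), rng.gauss(center[1], 0.5))
            for _ in range(n)]


seeds = {"no-drift": blob((0.0, 0.0), 5), "drift": blob((5.0, 5.0), 5)}
unlabeled = blob((0.0, 0.0), 60) + blob((5.0, 5.0), 60)
pseudo_labels = seeded_kmeans(unlabeled, seeds)

test_x = blob((0.0, 0.0), 20) + blob((5.0, 5.0), 20)
test_y = ["no-drift"] * 20 + ["drift"] * 20

# Accuracy versus k, the quantity used to choose the number of neighbors.
for k in (1, 3, 5, 7):
    print(k, accuracy(unlabeled, pseudo_labels, test_x, test_y, k))
```

On real monitoring data the classes overlap far more than in this toy example, so the accuracy versus k curve would be expected to vary with k and to depend on the chosen distance metric.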
The future aim of this work is to study the possibility of detecting the exact location of the drift within specific layers and to check the reliability of this model for other complex parts and materials. The coupling with data from other sensors, such as photodiodes, must also be studied. The semisupervised approach shows that ML in AM can be a robust method to improve the postprocessing of in situ data. The ultimate aim of this approach is real-time monitoring of the whole process with closed-loop feedback control: if drift is detected in the process, the printing parameters could be adjusted to avoid part failure.