Robust and Fast 3-D Saliency Mapping for Industrial Modeling Applications

New-generation 3-D scanning technologies are expected to revolutionize Industry 4.0, facilitating a large number of virtual manufacturing tools and systems. Such applications require the accurate representation of physical objects and/or systems, achieved through saliency estimation mechanisms that identify certain areas of the 3-D model, leading to a meaningful and easier-to-analyze representation of a 3-D object. 3-D saliency mapping therefore guides the selection of feature locations and is adopted in a large number of low-level 3-D processing applications, including denoising, compression, simplification, and registration. In this article, we propose a robust and fast method for creating 3-D saliency maps that accurately identifies sharp and small-scale geometric features in various industrial 3-D models. An extensive experimental study using a large number of 3-D scanned and CAD models verifies the effectiveness of the proposed method as compared to other recent and relevant approaches, despite the constraints posed by complex geometry patterns or the presence of noise.

increased, resulting in the interest for more accurate 3-D model processing.
The resolution and accuracy of the modern 3-D scanners are constantly increasing, making them even more attractive in several vision-based manufacturing tasks, allowing the accurate generation of dynamic virtual representations of physical objects which are then used for inspection. Inspecting the parts and repairing the damages or degradations are very basic tasks for many engineering or manufacturing products. More specifically, surface defect inspection is of primary importance for engineering part quality inspection, since surface defects affect not only the appearance of parts, but also their functionality, efficiency, and stability. This task mostly depends on human visual inspection by skilled inspectors. Human visual inspection is costly, labor-intensive, time-consuming, and prone to errors due to inspectors' lack of experience or fatigue, bad environmental conditions, etc. Hence, automatic inspection of the surfaces using computational techniques, which is faster, more consistent, and robust, is highly desired [5]. We are motivated by the fact that there are a lot of new-era industrial applications that require the digitization of physical objects or systems (e.g., inspection, digital twin, Industry 4.0, quality control, reverse engineering, etc.) creating or using already scanned 3-D objects. However, this digitized information is massive and raw, leading to the need of new essential and meaningful identification of features that will facilitate robust processing in various applications. These facts stress the need to focus on the development of computational models of visual attention, whose well-known outcomes are the saliency maps. Saliency maps are compact 3-D representations, generated by simplifying, annotating, and/or changing the representation of a physical object/system giving more emphasis to geometrically meaningful parts. 
The salient features also typically satisfy important requirements such as scaling, rotation, and resolution invariance that can simplify industrial processes. In this article, we focus on providing a method for the accurate extraction of a meaningful 3-D saliency mapping ideally suited for industrial 3-D models. More specifically, the contributions of the proposed approach can be summarized as follows.
1) It is designed to use both normals and guided normals, depending on the application. Guided normals provide more robust saliency mapping in cases of meshes affected by scanning noise and imperfections where other methods fail.
2) It combines the benefits of a spectral and a geometric approach in a single unified approach.

3) It exploits both local and global information of a model. In other words, the spectral method and the rows of the coherent matrix E enclose local information, while the columns of the coherent matrix E enclose global information corresponding to larger patches.
4) It uses the same configuration parameters independently of the input values for any model, making it 3-D model agnostic, without the need for further modification.
5) It has low computational complexity, especially in comparison with other spectral methods that use the connectivity information of a model.
6) It can be ideally used in applications related to industrial 3-D models which have some special geometric characteristics (e.g., intense corners and edges) or have been affected by complex noise patterns.
7) It is easily adaptive and can be efficiently adopted both by low-level applications (e.g., denoising, compression, registration, etc.) as a preprocessing step and/or by high-level industrial applications (e.g., maintenance, inspection, quality control, digital twin technologies, etc.).
8) An extensive performance assessment, using a large collection of different industrial 3-D models, clearly shows the robustness of our method as compared to other approaches.
The rest of this article is organized as follows. Section II presents state-of-the-art methods and related work. Section III presents some basic definitions and preliminaries. Section IV describes in detail the workflow of the proposed method and shows how our method can be used in a variety of actual industrial applications. Section V presents the experimental results and evaluation in comparison with other methods. Finally, Section VI concludes this article.

II. RECENT WORKS
Visual saliency is a subjective perception cue that differentiates a region from others and immediately attracts human attention [6]. The human visual system (HVS) is evolved to automatically detect salient regions over the entire field of view [7]. It is first attracted by the most representative salient elements and then the visual attention is transferred to other regions [8]. Most of the existing methods try to simulate the way that the human perceptual system works, giving more emphasis to what the human brain assumes as salient information. Nevertheless, what a human assumes as a salient feature may vary from what computational methods assume as salient. On the other hand, in industrial applications, simple geometry is usually more common and useful compared to complex surfaces of high spatial frequency that would trigger human visual attention. Wei et al. [9] presented a 3-D saliency mapping mechanism using the curvature co-occurrence histogram, following similar steps with the method proposed in [10] for extracting salient image features. Tao et al. [11] proposed an entropy-based saliency approach using the entropy of the normals to depict the local changes in a region. Song et al. [12] proposed a method which incorporates global considerations by making use of spectral attributes. An et al. [13] proposed a hybrid saliency taking into account both color and geometric information. Nouri et al. [14] proposed a saliency-based metric for the evaluation of the quality between an original and a distorted 3-D mesh comparing the structural information. They also [15] proposed a saliency method, using a local vertex descriptor that is used as a basis for similarity measurement and integrated into weighted multiscale saliency features. Zhao et al. [16] proposed a saliency detection method by diffusing a shape index field with a nonlocal means filter. 
Their algorithm generates a random center-surround operator to create a saliency map and uses the Retinex theory to improve the saliency map. Wu et al. [17] proposed a 3-D saliency map estimation considering both local contrast and global rarity. There are also many other methods that estimate a saliency mapping but in a completely different way. For example, the saliency mapping of method [18], which might be excellent for tactile-focused applications, is far from acceptable if applied in industrial applications like the ones demonstrated. Liu et al. [19] detect salient regions on the mesh based on the multiscale Laplacian fairing results in order to use them for head pose estimation. What they assume as a salient region in their application is significantly different from what we assume as salient features in our application. Kobyshev et al. [20] try to automatically find landmark buildings in a city using a context-dependent saliency mapping. In [21], the authors propose a saliency detection algorithm for large-scale colored 3-D point clouds, which exploits geometric features and color features together to estimate the saliency in colored point clouds.
A major drawback of the aforementioned methods is that their robustness is significantly deteriorated when applied to scanned 3-D models that have been affected by noise, outliers, or missing parts. Additionally, most of these works provide only visual maps to show their effectiveness and none of them have been used and evaluated in real industrial applications.

III. PRELIMINARIES
In this section, we present the basic definitions and preliminaries which are necessary for the complete understanding of our assumptions [22].

A. Basic Definitions of 3-D Meshes
In this article, we use triangle meshes $M$ consisting of $n$ vertices $v$ and $n_f$ faces $f$. A vertex $v_i$ is represented by Cartesian coordinates, denoted by $v_i = [x_i, y_i, z_i]^T$.

B. Robust Principal Component Analysis (RPCA)
RPCA has been used in order to decompose an observed measurement $E$ into a low-rank matrix $L$ representing the real data and a sparse matrix $S$ representing the noisy data by solving $\arg\min_{L,S} \|L\|_* + \lambda \|S\|_1$, s.t. $L + S = E$, where $\|L\|_*$ is the nuclear norm of a matrix $L$ (i.e., $\sum_i \sigma_i(L)$, the sum of the singular values of $L$). Despite the effectiveness that some works [23] have presented in the past, their execution times need improvement. The computational complexity is a crucial issue, especially for use in industrial applications. We handle this convex problem using a very fast approach, as described in [24], in which the rank $K$ is increased by one (i.e., $K = K + 1$) whenever a condition on the singular values $u$, governed by a small threshold, is met.
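To make the decomposition $E = L + S$ concrete, the following is a minimal sketch of the textbook principal component pursuit solver (augmented Lagrangian with singular value thresholding). It is not the fast rank-incremental variant of [24]; the function names `shrink`, `svd_threshold`, and `rpca`, as well as the default choices of `lam` and `mu`, are assumptions made for illustration only.

```python
import numpy as np

def shrink(X, tau):
    """Soft-thresholding: proximal operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca(E, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Approximately solve min ||L||_* + lam * ||S||_1  s.t.  L + S = E."""
    E = np.asarray(E, dtype=float)
    m, n = E.shape
    # ASSUMPTION: common default parameters for principal component pursuit.
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    mu = 0.25 * m * n / (np.abs(E).sum() + 1e-12) if mu is None else mu
    L = np.zeros_like(E)
    S = np.zeros_like(E)
    Y = np.zeros_like(E)                      # Lagrange multipliers
    norm_E = np.linalg.norm(E) + 1e-12
    for _ in range(max_iter):
        L = svd_threshold(E - S + Y / mu, 1.0 / mu)
        S = shrink(E - L + Y / mu, lam / mu)
        residual = E - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) / norm_E < tol:
            break
    return L, S
```

Each iteration costs one full SVD of $E$, which is the cost that a faster variant such as the one cited in [24] is meant to reduce.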
Each patch represents a small area consisting of $k$ neighboring faces. The $k$ geometrically nearest faces of the face $f_i$ are estimated by the $k$ nearest neighbors ($k$-nn) algorithm (where we set $k = 25$). The main purpose is to find which one of these candidate patches $P_{ij}$ is the ideal representative area for the face $f_i$, in terms of the direction of the centroid normals. The parameters that are investigated for the identification of the optimal patch are as follows: i) the maximum distance between the $i$th centroid normal and the other centroid normals of the same patch [see (6)] and ii) the mean saliency $\phi_{ij}$ based on the salient weights $s$ of a patch $P_{ij}$ [see (7)]. Among all candidate patches, the ideal selected patch is the one, $P^*$, with the smallest value of (5), $\forall\, i = 1, \ldots, n_f$, $\forall\, j = 1, \ldots, n_p$. Examining (5), we can observe that the ideal patch has a similar direction of normals (small distance $\omega$ between normals) and lies in a flat surface area (the smaller the mean saliency $\phi$, the fewer the salient features). Finally, the guided normal $g_i$ is estimated as the weighted average normal of this ideal patch $P^*$, where $A_j$ represents the area of face $f_j$.
Fig. 1 briefly presents the pipeline of our approach. We start by separating the whole mesh into $n_f$ (i.e., equal to the number of centroids) overlapped and equally sized patches. Then, we estimate the spectral and geometrical saliency and, finally, we combine these two values. Once the saliency mapping of a mesh has been estimated, it can be used in several different industrial applications, facilitating several processes in manufacturing, maintenance, inspection, and repairing.
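The guided-normal selection described earlier in this section can be sketched as follows. Since the exact form of (5) is not reproduced in this text, the sketch assumes the patch score is the product of the maximum normal distance $\omega$ and the mean saliency $\phi$; the function name `guided_normal` and its arguments are hypothetical.

```python
import numpy as np

def guided_normal(i, candidate_patches, normals, areas, saliency):
    """Pick the candidate patch with the smallest score for face i and return
    its area-weighted average normal together with the chosen patch."""
    best_score, best_patch = np.inf, None
    for patch in candidate_patches:
        patch = np.asarray(patch)
        # omega: largest distance between the normal of face i and the patch normals
        omega = np.max(np.linalg.norm(normals[patch] - normals[i], axis=1))
        # phi: mean saliency of the faces in the patch
        phi = saliency[patch].mean()
        score = omega * phi          # ASSUMPTION: stands in for the missing (5)
        if score < best_score:
            best_score, best_patch = score, patch
    # area-weighted average normal of the selected patch, renormalized
    g = (areas[best_patch, None] * normals[best_patch]).sum(axis=0)
    return g / np.linalg.norm(g), best_patch
```

With this score, a patch is preferred when its normals agree with the normal of face $f_i$ and its faces carry low saliency, which matches the qualitative reading of (5) given in the text.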

A. Geometry-Based Saliency Analysis
The geometrical saliency features are estimated by exploiting the sparsity of the guided normals. Centroid normals $n_c$ can also be used in this analysis; however, guided normals have a more robust behavior in the presence of scanning noise [27]. The estimated patches $P_i$ are used for the construction of the matrix $E \in \mathbb{R}^{3n_f \times (k+1)}$ [see (9)]. Then, we apply the RPCA approach to this matrix, as described in Section III-B, taking advantage of the geometrical coherence between neighboring guided normals. By the decomposition, the low-rank $L$ and sparse $S$ matrices are estimated. However, the estimation of the geometric saliency feature $s_{1i}$ of the centroid $c_i$ requires only the values of the first column of the sparse matrix, where $S_{i1_x}$ denotes the scalar value of the $x$ coordinate of the $i$th row of the first column of the matrix $S$.
The motivation for exploiting the sparsity of the guided normals is based on the observation that the similarity of the normals between neighboring triangles is an index of geometrical coherence of the triangles. Low values of the sparse matrix mean that the normals of a triangle and its neighbors are similar (low-rank), so if all triangles of a neighboring area have similar geometrical behavior, this means that this patch represents a flat area. On the other hand, if there is a big dissimilarity, this means that the surface has an abnormal shape.
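As a rough illustration of this step, the sketch below builds the patch matrix $E$ from per-face normals using a brute-force $k$-nn search and reads a per-face saliency from the first column of a sparse matrix $S$. The formula for $s_{1i}$ is not reproduced in this text, so taking the Euclidean norm of the $(x, y, z)$ entries is an assumption, as are the names `build_patch_matrix` and `geometric_saliency`.

```python
import numpy as np

def build_patch_matrix(centroids, normals, k=25):
    """Return E with shape (3*n_f, k+1): rows 3i..3i+2 hold the x, y, z
    coordinates of the normal of face i (column 0) and of its k nearest faces."""
    d2 = ((centroids[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, -1.0)                 # each face is its own first neighbor
    idx = np.argsort(d2, axis=1)[:, :k + 1]    # (n_f, k+1) neighbor indices
    E = normals[idx].transpose(0, 2, 1).reshape(-1, k + 1)
    return E, idx

def geometric_saliency(S):
    """ASSUMPTION: s_1i is the l2 norm of the x, y, z entries that face i
    contributes to the first column of the sparse matrix S."""
    n_f = S.shape[0] // 3
    return np.linalg.norm(S[:, 0].reshape(n_f, 3), axis=1)
```

The pairwise distance matrix needs memory quadratic in the number of faces, so a spatial index would be a more sensible choice for dense scans; the brute-force search is used here only to keep the sketch dependency-free.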

B. Spectral-Based Saliency Analysis
For each face $f_i$ of the mesh, we use $E_i \in \mathbb{R}^{3 \times (k+1)}$, representing the $i$th row of the matrix $E$ in (9), to form the matrix $R_i = E_i E_i^T$. Then, $R_i = U \Lambda U^T$ is decomposed into a matrix $U$, consisting of the eigenvectors, and a diagonal matrix $\Lambda = \mathrm{diag}(\lambda_{i1}, \lambda_{i2}, \lambda_{i3})$ of the corresponding eigenvalues. Finally, the spectral saliency $s_{2i}$ of a centroid $c_i$ is denoted as the value given by the inverse $\ell_2$-norm of the corresponding eigenvalues

$$s_{2i} = \frac{1}{\sqrt{\lambda_{i1}^2 + \lambda_{i2}^2 + \lambda_{i3}^2}} \qquad (13)$$

Observing (13), we can see that large values of the term $\lambda_{i1}^2 + \lambda_{i2}^2 + \lambda_{i3}^2$ correspond to small saliency features, indicating that the centroid lies in a flat area, while small values correspond to large saliency values, characterizing the specific centroid as a feature. This can be easily justified by the fact that the centroid normal of a face lying in a flat area is represented by one dominant eigenvector, the corresponding eigenvalue of which has a very large value. On the other hand, the centroid normal of a face lying in a corner is represented by three eigenvectors that correspond to eigenvalues with small but almost equal amplitude.
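The spectral step can be sketched in a few lines, assuming that $R_i$ is formed as $E_i E_i^T$ (a $3 \times 3$ symmetric matrix) and that $E$ stacks the $3 \times (k+1)$ blocks $E_i$ vertically. The function name `spectral_saliency` and the small constant `eps` guarding against division by zero are illustrative choices.

```python
import numpy as np

def spectral_saliency(E, eps=1e-12):
    """Inverse l2 norm of the eigenvalues of R_i = E_i E_i^T for every face."""
    n_f = E.shape[0] // 3
    Ei = E.reshape(n_f, 3, -1)                 # (n_f, 3, k+1)
    R = Ei @ Ei.transpose(0, 2, 1)             # (n_f, 3, 3), symmetric
    lam = np.linalg.eigvalsh(R)                # (n_f, 3) eigenvalues per face
    return 1.0 / (np.linalg.norm(lam, axis=1) + eps)
```

For a patch of 26 identical unit normals the eigenvalues are (0, 0, 26), giving a saliency of 1/26, whereas a patch whose normals are split between the three axes has three smaller eigenvalues and therefore a larger saliency.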

C. Estimating the Saliency of Vertices
We then normalize the spectral and geometric saliency to the range [0, 1]. For the sake of completeness, we denote the saliency mapping as the weighted combination of the normalized geometric $\bar{s}_1$ and spectral $\bar{s}_2$ saliency features, where $w_1$ and $w_2$ are the corresponding weights, which can be tuned to give emphasis to one approach or the other. However, we suggest the use of $w_1 = w_2 = 1$, which is also used in all of our experiments. The proposed method is robust, even when we assume complex surfaces with different geometrical characteristics, since it exploits both spectral characteristics (i.e., oversensitivity in the variation of neighboring centroid normals) and geometrical characteristics. The saliency of a vertex $v_i$ is then estimated over $N_i$, where $N_i$ represents the first-ring area of the vertex $v_i$.
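A minimal sketch of the normalization, combination, and face-to-vertex transfer follows. The exact equations are not reproduced in this text, so three assumptions are made: min-max scaling for the normalization, a plain weighted sum for the combination, and an unweighted mean over the faces incident to each vertex for the vertex saliency. All function names are hypothetical.

```python
import numpy as np

def minmax(s):
    """ASSUMPTION: normalization to [0, 1] is plain min-max scaling."""
    s = np.asarray(s, dtype=float)
    span = s.max() - s.min()
    return np.zeros_like(s) if span == 0 else (s - s.min()) / span

def face_saliency(s1, s2, w1=1.0, w2=1.0):
    """ASSUMPTION: the weighted combination is w1 * s1_bar + w2 * s2_bar."""
    return w1 * minmax(s1) + w2 * minmax(s2)

def vertex_saliency(faces, s_face, n_vertices):
    """ASSUMPTION: a vertex takes the mean saliency of its incident faces."""
    acc = np.zeros(n_vertices)
    cnt = np.zeros(n_vertices)
    for c in range(3):                         # the three corners of every triangle
        np.add.at(acc, faces[:, c], s_face)
        np.add.at(cnt, faces[:, c], 1.0)
    return acc / np.maximum(cnt, 1.0)
```

With $w_1 = w_2 = 1$, as suggested in the text, both cues contribute equally; an area-weighted mean over the first ring would be an equally plausible reading of the vertex step.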

D. Speed-Up Process-Sampling Matrix E
Although we use a very fast variant of RPCA, the most time-consuming step is still the decomposition of the coherent matrix E into a low-rank and a sparse matrix. The computational complexity of this method is related to the size of the data, so one way to decrease the execution time is to use a smaller matrix without, of course, reducing the detection accuracy. To achieve this, we first use the saliency mapping results of the spectral method. The saliency map helps us to make a first coarse estimation of where the sharp features and the flat areas exist.
We start by assuming that each vertex can be categorized into a saliency class, based on its salient value that has been extracted by the spectral method only. We use 64 classes in total, which is equal to the number of different colors of the "jet" colormap that is also used for the visualization of the saliency mapping. Class 1 consists of the least salient vertices, while class 64 consists of the most salient vertices. Then, relying on the observation that a large quantity of vertices belong to class 1 (as we can also see in the histograms of Fig. 2), we exclude these vertices. Finally, we create a smaller matrix $\tilde{E} \in \mathbb{R}^{3\tilde{n}_f \times (k+1)}$ using all the vertices of the rest of the classes, where $\tilde{n}_f < n_f$.
The final execution times of the fast approach are further improved and the speed-up of the new algorithm is up to ∼85%, as we can see in Table I. Table I also presents the results of the saliency mapping in different models using the two presented approaches, namely the original RPCA and the faster approach applied only to the salient vertices of the mesh. As we can observe, there is no big perceptual difference between the two results.
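The sampling step above can be sketched as an equal-width binning into 64 classes followed by a row selection. The sketch assumes one spectral saliency value per patch of $E$ (the text speaks of vertices) and equal-width bins over the normalized range; `saliency_classes` and `reduce_patch_matrix` are hypothetical names.

```python
import numpy as np

def saliency_classes(s, n_classes=64):
    """Map saliency values to classes 1..n_classes using equal-width bins."""
    s = np.asarray(s, dtype=float)
    span = s.max() - s.min()
    s01 = np.zeros_like(s) if span == 0 else (s - s.min()) / span
    return np.minimum((s01 * n_classes).astype(int), n_classes - 1) + 1

def reduce_patch_matrix(E, s_spectral, n_classes=64):
    """Drop the 3-row blocks of E whose patch falls in class 1 (least salient)."""
    keep = saliency_classes(s_spectral, n_classes) > 1
    n_f = E.shape[0] // 3
    E_small = E.reshape(n_f, 3, -1)[keep].reshape(-1, E.shape[1])
    return E_small, keep
```

The boolean mask `keep` is returned so that saliency values computed on the reduced matrix can be scattered back to their original positions, with the excluded entries treated as flat.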

E. Utilizing 3-D Saliency Mapping in Industrial Applications
In the following subsections, we present indicatively some industrial applications in which the proposed saliency mapping can be utilized, facilitating several visual tasks.

1) Utilization in the Manufacturing Industry for Quality Control Inspections: It is very common in the manufacturing industry for objects to be produced in different sizes while retaining the same form as the prototype model. Nonetheless, to assure quality, the reconstructed objects must satisfy a range of statutory and contractual obligations. In this case, inspection is used to verify and certify that the new scaled object has been manufactured in full compliance with all specified requirements and constraints. In Fig. 3, we present examples of inspection between real-scanned industrial objects, denoted as prototype models (the "Aeronautics actuator casting," "Automobile Hubcap," and "Oil pump" models) [Fig. 3(a)], and their corresponding scaled and deformed 3-D objects [Fig. 3(b)]. Our purpose is to inspect whether the newly manufactured 3-D object has exactly the same design details as the original (regarding the fidelity of its form) and also to ensure that it has not been affected by irregularities encountered during the manufacturing processes. In Fig. 3(c), we present an enlarged representation of the scaled model presented in Fig. 3(b), with red circles that specify the deformed areas. The purpose of this application is to automatically identify deformations or other abnormalities on the surface of the manufactured 3-D object in comparison with the original model. For easier comparison, we provide a heatmap visualization of the difference between the original and the constructed model. Blue color means that there is no difference between the compared models, while red color indicates a big difference. Our method is able to find and highlight possible differences between two objects with similar shapes by comparing the saliency values of their surfaces. In this way, it is capable of automatically inspecting degradations of the surface standards of manufactured objects despite the constraints posed by scaled manufactured objects. For the comparison, we used both HD and the salient values [Fig. 3(e)] according to (16).
We assume that we have two normalized 3-D models $M_1 \in \mathbb{R}^{n_1 \times 3}$ and $M_2 \in \mathbb{R}^{n_2 \times 3}$, where $n_1 = n_2$ (e.g., original and compared, respectively). First, for each vertex of these two models, we create a representative vector consisting of the values of its coordinates and its saliency.
Then, for these indices $d^*_i$, we estimate the saliency difference $(s_{2i} - s_{1\tilde{d}^*_i})$ $\forall\, i = 1, \ldots, n_2$ and we visualize it. The experiments verify that the proposed approach can identify deviations easily and with great detail.
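The comparison described above can be sketched as a nearest-neighbor lookup in the four-dimensional space of coordinates plus saliency, followed by a per-vertex saliency difference. The matching equations are not reproduced in this text, so the use of the Euclidean distance on the vector $[x, y, z, s]$ is an assumption, and `saliency_difference` is a hypothetical name.

```python
import numpy as np

def saliency_difference(V1, s1, V2, s2):
    """For every vertex of model 2, find the closest vertex of model 1 in
    (x, y, z, saliency) space and return the saliency difference."""
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    F1 = np.column_stack([V1, s1])             # (n1, 4) representative vectors
    F2 = np.column_stack([V2, s2])             # (n2, 4)
    d2 = ((F2[:, None, :] - F1[None, :, :]) ** 2).sum(axis=-1)
    d_star = d2.argmin(axis=1)                 # index of the best match in model 1
    return s2 - s1[d_star], d_star
```

The returned differences can be mapped directly to a heatmap, with values near zero indicating agreement between the two models and large magnitudes indicating a deviation.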

2) Utilization for the Creation of Digital Twins and Aging Inspection: The proposed method supports detecting changes that can be caused by aging by comparing the saliency mappings of a 3-D object acquired at two or more different moments in time. In this way, our approach could be used to identify surface differences of the same object, affected by mechanical stress (e.g., a gear of a machine) or deteriorated due to environmental conditions (e.g., an ancient statue or columns). In Fig. 4, we present visual representations of the same gear on four different occasions (i.e., at an early stage and at three consecutive later moments). This figure shows that our method is able to capture differences due to aging that are so subtle that even the human eye could not easily notice them.
Heatmap visualization of saliency mapping using (c) curvature co-occurrence histogram [9], (d) entropy-based salient model [11], (e) mesh saliency via spectral processing [12], and (f) the proposed method.

3) Utilization in the Heritage
of the saliency mapping can be utilized so as to automatically detect cracks and other defects on their surfaces. The presented figures verify that the proposed 3-D saliency mapping approach is very useful, since it can be used for identifying areas of the original model that need to be repaired, facilitating the work of the experts during the maintenance process. Our method uses small patches of neighboring vertices; thus, if an abnormality appears somewhere on the surface, our algorithm is able to recognize it and highlight this specific area. The higher the abnormality, the higher the saliency value, so it is ideal for the inspection of cracks, damages, etc. In this way, this method could be used as a preprocessing step for the creation of a digital replica of the original cultural object, without imperfections, since it highlights the areas that need repair (i.e., digital repairing is also available). The recent trend for digitalization and creation of digital twin models has a lot of historical interest in the heritage industry. A digitally repaired 3-D model can be used for the VR/AR representation of a heritage object (e.g., for educational purposes), showing how it looked originally and additionally giving visitors the opportunity to see reconstructed views of the object.

V. EXPERIMENTAL ANALYSIS AND RESULTS
The proposed saliency mapping was evaluated using a) heatmap visualization, b) 3-D mesh simplification based on the saliency of the vertices, and c) a denoising application using the saliency values for finding the ideal patches.
Fig. 6. (a) Original model, and heatmap visualization of saliency mapping based on (b) the eigenvalues of small patches (spectral analysis), as described in Section IV-B, (c) the RPCA approach (geometrical analysis), as described in Section IV-A, (d) Wei et al. [9], (e) Tao et al. [11], (f) Lee et al. [28], (g) Song et al. [12], (h) Guo et al. [29], (i) Song et al. (CNN) [30], and (j) our approach.
Fig. 8. Ideal patch selection based on the proposed saliency mapping.
Fig. 9. Heatmaps of saliency mapping and denoising results using the methods. (a) Curvature co-occurrence histogram [9]. (b) Entropy-based salient model [11]. (c) Mesh saliency [28]. (d) Mesh saliency via spectral processing [12]. (e) Point-wise saliency detection [29]. (f) Mesh saliency via CNN [30]. (g) Our approach.
It should be emphasized that, in most cases, there is no ground-truth saliency map or reliable metric that can be used for benchmarking purposes. The typical way to evaluate a saliency map is via subjective evaluation. The subjective evaluation can clearly show whether a specific saliency mapping has achieved its purpose when applied in a specific application, and it provides a fair comparison with the results of other saliency mapping methods. The colormap used for the visualization is "jet," containing 64 colors (deep blue = 0, deep red = 1). The saliency mapping of a 3-D object must provide visual information that is easily recognizable. This means that different areas with different characteristics will be highlighted with a different color. On the other hand, different areas with the same characteristics will be highlighted with the same color. The experimental results show that our method [in Fig. 6(j)] successfully follows this direction, providing more robust and meaningful results than the other approaches. More specifically, the highest values (red colors) represent very distinctive vertices (e.g., corners), while the lowest values (blue colors) represent flat areas.
2) Simplification Based on the Saliency of Vertices: Due to the ease of creating digital 3-D content nowadays, a great amount of information can be captured and stored. The information acquired by 3-D scanners is usually huge, creating dense 3-D models that are very difficult to handle efficiently in other applications (i.e., high computational complexity). This information must be simplified, keeping only the most representative information and removing the least important information. Simplification is a low-level application that focuses on representing an object using a lower resolution mesh without errors or with errors that cannot be easily perceived. The main objective of a successful simplification approach is to remove only those vertices which do not offer significant geometric information to the simplified 3-D object and whose removal will not significantly change the shape or perceptual details of the 3-D object. Following this line of thought, we suggest removing the least perceptually important vertices, preserving only the most salient vertices for the reconstruction of the new simplified 3-D model. More specifically, the steps of the suggested simplification process are as follows: i) all vertices are sorted based on their salient values; ii) the K vertices with the highest salient values remain; iii) the remaining n − K less salient vertices are removed and the k-nn algorithm is used for the recreation of the new connectivity (triangulation). Fig. 7 presents the simplified meshes under different simplification scenarios.
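The three simplification steps can be sketched as a sort, a selection of the K most salient vertices, and a $k$-nn search among the survivors. How the new triangulation is built from the neighbor lists is not specified in the text, so the sketch stops at the neighbor indices; `simplify_by_saliency` and its parameters are hypothetical.

```python
import numpy as np

def simplify_by_saliency(vertices, saliency, K, k=3):
    """Keep the K most salient vertices and return, for each kept vertex,
    the indices of its k nearest kept neighbors."""
    saliency = np.asarray(saliency, dtype=float)
    order = np.argsort(-saliency, kind="stable")   # step i: sort by saliency
    keep = np.sort(order[:K])                      # step ii: K most salient remain
    kept = np.asarray(vertices, dtype=float)[keep] # step iii: the rest are dropped
    d2 = ((kept[:, None, :] - kept[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)                   # a vertex is not its own neighbor
    neighbors = np.argsort(d2, axis=1)[:, :k]      # input for the re-triangulation
    return keep, kept, neighbors
```

The neighbor indices refer to positions in the reduced vertex list, so `keep[neighbors]` recovers the original vertex identifiers if they are needed.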
3) Feature-Aware Denoising Based on the Saliency of Vertices: Guided normals filtering has been used in the past works [26], [25] providing excellent denoising results. In [26], the saliency is estimated using the difference between the normals of the two incident faces. We follow the same line of thought but we use a different way for the estimation of the ideal patch. More specifically, we select the patch that has the smallest value of Ψ, according to (20) and (21), since it consists of "less salient" faces (flat areas that are depicted with deep blue color).
In these examples, we show that the selected ideal patch is the one with the lowest value of Ψ (i.e., Ψ = 0.32 and Ψ = 0.37), representing the area with the less salient features. As we can observe, both the first and the last patches represent totally flat areas; however, they do not have the same Ψ value since the first patch consists of more salient triangles in comparison to the last patch, so the last area is more preferable to represent the ideal patch. We also can observe that our method provides reliable results of saliency mapping even under the presence of noise, which makes it ideal for use in applications with noisy 3-D models. The purpose of this example is the estimation of the most representative centroid normal (i.e., guided normal) in order to use it for a more efficient bilateral filtering [31].
Finally, the denoised centroid normals are used to update the vertices according to [32], where $\langle a, b \rangle$ represents the inner product of $a$ and $b$. Note here that we do not search for ideal parameters for each 3-D model or method. Instead, in all the experiments and for all of the approaches, we use exactly the same values for each parameter. Specifically, we define σ 2 = 0.25, 15 iterations for the bilateral filtering equations (22) and (23), and 20 iterations for the vertex updating (24). The ideal selected patch must consist of normals with similar direction (in order to satisfy the normals' consistency). The patches that have a lot of corners or edges (i.e., high salient values in our case) must be excluded, since they consist of normals lying in different directions. As a result, the value of Ψ would be totally misleading, since it would not represent a specific planar area. Fig. 9 presents the denoising results with enlarged regions for easier comparisons. The quality of the reconstructed models is evaluated using the following metrics: i) θ, representing the mean angle between the normals of the ground truth and the reconstructed faces, and ii) the HD.
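For illustration, the sketch below implements a commonly used normal-driven vertex update, in which each vertex is moved along the denoised normals of its incident faces by the projected offset to the face centroids, together with the mean-angle metric θ. Whether this update matches (24) term by term is an assumption, since that equation is not reproduced in this text; the function names are hypothetical.

```python
import numpy as np

def update_vertices(V, F, normals, n_iter=20):
    """Iteratively apply v_i += mean_j( n_j * <n_j, c_j - v_i> ) over the faces j
    incident to vertex i, where c_j is the centroid of face j."""
    V = np.asarray(V, dtype=float).copy()
    for _ in range(n_iter):
        C = V[F].mean(axis=1)                      # face centroids, (n_f, 3)
        acc = np.zeros_like(V)
        cnt = np.zeros(len(V))
        for c in range(3):                         # the three corners of every face
            vid = F[:, c]
            proj = np.einsum("ij,ij->i", normals, C - V[vid])   # <n_j, c_j - v_i>
            np.add.at(acc, vid, normals * proj[:, None])
            np.add.at(cnt, vid, 1.0)
        V += acc / np.maximum(cnt, 1.0)[:, None]
    return V

def mean_normal_angle(n_ref, n_rec):
    """Metric theta: mean angle in degrees between corresponding unit normals."""
    cos = np.clip(np.einsum("ij,ij->i", n_ref, n_rec), -1.0, 1.0)
    return np.degrees(np.arccos(cos)).mean()
```

The default of 20 iterations mirrors the number of vertex-updating iterations stated in the text.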

VI. CONCLUSION
We presented a 3-D feature-aware saliency estimation approach, taking into account both spectral and geometrical information of a 3-D object. The purpose of this research was to provide a meaningful 3-D saliency mapping which could be beneficial for industrial applications. Extensive evaluation studies, carried out with several evaluation scenarios (e.g., heatmap visualization for visual perception, simplification, and denoising), verified the superiority of our approach as compared to other state-of-the-art approaches. We also presented a variety of actual industrial applications (i.e., manufacturing inspection of scaled object, inspection of aging mechanical parts, and facilitation of heritage repairing/maintenance) in which our method can be successfully utilized, in different industrial areas (i.e., manufacturing, heritage, and medical).
The saliency mapping of our method did not just detect defects, but it highlighted areas with high-frequency spatial components (which means sharp features, noise, and abnormalities) and areas where a neighborhood of normals has a random variance behavior (normals of this neighborhood do not have a prevailing direction). In other words, our approach highlighted areas where the range distribution of normals is high. In Fig. 10, we presented boxplots showing the standard deviation of the normals for all overlapped patches, which have been categorized (into eight categories) based on the salient values of each vertex. More specifically, we estimated the standard deviation of the normals of each patch $E_i$ as

$$\sigma_i = \mathrm{std}(E_i) \quad \forall\, i = 1, \ldots, n_f \qquad (25)$$

where $\mathrm{std}(A)$ represents the function that estimates the standard deviation of the values of $A$. Then, we categorized each vertex based on the saliency values of our method into eight categories.
As we can see, categories consisting of less salient vertices, representing flat areas, have smaller mean values of standard deviation since all the normals of a neighborhood have a common direction and form. On the other hand, as we move to categories including more salient vertices, the mean value of the standard deviation increases, meaning that the directions of the normals of a neighborhood become more irregular.
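The analysis behind the boxplots can be sketched as a per-patch dispersion measure followed by a grouping into eight equal-width saliency categories. The text does not say how std is applied to the $3 \times (k+1)$ block $E_i$; the sketch assumes the per-coordinate standard deviation across the $k+1$ normals, averaged over $x$, $y$, and $z$, so that a patch of identical normals yields zero. It also assumes one saliency value per patch. Both function names are hypothetical.

```python
import numpy as np

def patch_normal_std(E):
    """ASSUMPTION: sigma_i is the std across the k+1 normals of patch i,
    computed per coordinate and averaged over x, y, z."""
    n_f = E.shape[0] // 3
    return E.reshape(n_f, 3, -1).std(axis=2).mean(axis=1)

def group_by_saliency(values, saliency, n_cat=8):
    """Split values into n_cat lists using equal-width bins of normalized saliency."""
    values = np.asarray(values, dtype=float)
    s = np.asarray(saliency, dtype=float)
    span = s.max() - s.min()
    s01 = np.zeros_like(s) if span == 0 else (s - s.min()) / span
    cat = np.minimum((s01 * n_cat).astype(int), n_cat - 1)
    return [values[cat == c] for c in range(n_cat)]
```

Each list returned by `group_by_saliency` corresponds to one box of the plot, ordered from the least to the most salient category.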