BIM-based construction quality assessment using Graph Neural Networks

Automated construction quality control and as-built verification often involve comparing 3D point clouds captured on-site with as-designed Building Information Models (ad-BIM) at the individual element level. However, signal noise and occlusions, common in data captured from cluttered job sites, can negatively affect the performance of these methods, which overlook the semantic relationships between elements. In this paper, we introduce a novel approach to automated quality control that enhances element-wise quality assessments by exploiting semantics in BIM. The proposed method represents ad-BIM as a graph by encoding elements' topological and spatial relationships. Exploiting this representation, we propose a Graph Neural Network (GNN)-based algorithm to infer element-wise built quality status. Our method significantly outperforms classical methods and allows for inference on partially observed or unobserved elements.


Introduction
With the growing adoption of BIM within the construction industry, quality control processes are being automated to optimize cost and improve efficiency [1]. This is accomplished by comparing reality capture data (laser scanning, photogrammetry) to the as-designed BIM (ad-BIM) to detect errors in construction, thereby preventing costly downstream effects that subsequently affect the project cost and schedule [2].
Current approaches to automated quality control report element-wise quality, where the as-built status of each building element is evaluated in isolation [1,3,4] without considering the surrounding context. Irrespective of the reality capture system used, the captured data are plagued by noise and occlusions, which lead to inaccurate assessments. On cluttered construction sites, the reality capture data are affected by occlusions, sensor noise, weather conditions, and material properties, thereby resulting in the partial observability of the built structure (Figure 2). To cope with these sources of noise, it is essential to consider the semantic (e.g., spatial, topological, and temporal) relationships between building elements to enhance the quality control assessments.
Figure 1. The top shows the ad-BIM for an institutional building, with element-wise construction quality status labels. The bottom displays the same ad-BIM represented as a graph to leverage semantic relationships between elements to enhance as-built quality assessments.
In this paper, we present BIM-GNN, a novel method for detecting the built quality of building elements on a construction site using Graph Neural Networks (GNNs), incorporating semantic information extracted from ad-BIM. Our contributions are twofold. Firstly, we propose a method for constructing a graph representation of buildings from their ad-BIM (Figure 1), which is amenable to graph-based inference. In this representation, the nodes represent BIM objects, and the edges represent their topological and spatial associations. Secondly, we propose a graph inference algorithm that utilizes the graph structure and the features computed on its nodes to classify the building elements using Graph Attention Networks [5] into one of four quality classes, namely, verified (element built within the desired tolerance), deviated (element built outside of the desired tolerance), missing (element not built), and no data (not enough information to make an assessment). Our experiments in Section 4 show that leveraging our graph representation allows our inference algorithm to significantly outperform existing element-based methods.
40th International Symposium on Automation and Robotics in Construction (ISARC 2023)

Background
In this section, we briefly review what most automated quality control techniques have in common and why we need to revisit the problem. Then, we provide a short overview of graph neural networks, graph representation of buildings, and the applications of GNNs in the architecture, engineering, and construction (AEC) domain.

Automated construction quality control
Automated quality control of construction projects is a well-studied area of research [6,7]. Typically, solutions to this problem involve comparing the ad-BIM to 3D point clouds of the construction site captured with cameras [8] or laser scanners [7]. These comparisons can be done manually or using machine learning techniques. The machine learning-based solutions often treat the individual building elements as independent and identically distributed (iid) data which are fully observed. This is seldom the case, as the data are prone to noise and occlusions and the building elements are semantically linked. In this paper, we deal with the problem of partial observability by leveraging the graph structure and semantic relationships between elements of the ad-BIM to make robust quality control assessments of individual building elements.

Graphs and graph neural networks
As our ad-BIM is represented as a graph, we need learning algorithms that are capable of operating on non-Euclidean, permutation-invariant data [9]. Most graph learning algorithms map the information the graph represents to a vector in a high-dimensional linear embedding space. These learned vectors can then be used for downstream learning tasks such as classification, regression, clustering, and generation. Graph Neural Networks (GNNs) typically use message passing: the representation of a node in the current layer is computed from the embeddings of its neighborhood in the previous layer, as shown in Figure 3. The set of nodes that are directly connected to a given node via an edge is referred to as its neighborhood, which can include the node itself. Consequently, GNNs with $k$ layers allow nodes to capture information within their $k$-hop neighborhood [10]. These networks take a graph as input and compute node embeddings through a series of non-linear transformations, allowing for prediction at the node, edge, or graph level. Many GNN architectures have been proposed that vary in their definitions of message, transformation, and aggregation operations. These architectures also differ in how they stack layers using different graph manipulation techniques to accomplish supervised or unsupervised tasks at the node or graph level.
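To make the link between the number of layers and the hop distance concrete, the following is a minimal sketch of message passing with mean aggregation on a three-node path graph. It is illustrative only: the learned transformation and non-linearity of a real GNN layer are omitted, and the function name, graph, and feature values are assumptions made for this example.

```python
def message_passing_layer(features, neighbors):
    """One round of mean aggregation over each node's neighborhood (self included).

    A real GNN layer would also apply learned weights and a non-linearity;
    both are left out here to isolate the aggregation step.
    """
    updated = {}
    for node, nbrs in neighbors.items():
        group = [node] + list(nbrs)
        updated[node] = sum(features[n] for n in group) / len(group)
    return updated


# Path graph A - B - C; only C carries a non-zero signal initially.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
x0 = {"A": 0.0, "B": 0.0, "C": 1.0}

x1 = message_passing_layer(x0, neighbors)  # A sees only its 1-hop neighborhood
x2 = message_passing_layer(x1, neighbors)  # A now receives C's signal via B

print(x1["A"], x2["A"])
```

After one layer node A is still unaffected by C, which is two hops away; after the second layer part of C's signal has reached A through B.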

Graph representation of BIM
Graph representation of buildings has been used for accessibility analysis [11], generative design [12], and large BIM file processing [13]. These representations capture geometries, semantics, and some spatial and topological relationships. They can be used to filter out irrelevant information, group related data, and identify key components. For instance, graphs were used to represent the relationships between IFC instances to enable topological querying, with semantic information being incorporated as the node and edge weights [14]. However, using these representations for learning-based quality control inference models has not been previously explored.

Figure 3. Graph neural networks (GNNs). (A) GNNs map information in the graph domain to a $d$-dimensional vector representation. (B) A 2-layer GNN example with a message-passing mechanism, where $x_v^{(l)}$ indicates the representation of node $v$ at layer $l$. Each node has its own computational graph, through which its representation is updated by transformation and aggregation of its neighborhood's representation. Here we only show nodes A and F. The transformed representation of the neighbor, self, and the connecting edge in layer $l-1$ constitute a message (MSG) in layer $l$. Every node aggregates (AGG) the messages it receives, using a permutation-invariant aggregation function (e.g., sum or average), and updates its representation.

GNN in AEC domain
GNNs have seen limited use in the AEC domain but are gaining attention due to their potential applications [15]. For example, GNNs were used in conjunction with spatial vector data to classify patterns among groups of buildings for urban planning [16]. In architectural design, GNNs were used for the automated generation of floor plans that follow specific space planning rules [17]. In construction, these techniques were utilized to identify the most time-efficient construction sequence and to improve scheduling productivity and accuracy [18]. Finally, the automatic classification of room types [19] and the automation of the classification of BIM objects into different Industry Foundation Classes (IFC) [20] categories are two of the few examples of the use of BIM and GNNs.

Methods
In this section, we explain how we convert the ad-BIM IFC into a graph structure that can be used for automated machine learning-based quality control assessments. In Section 3.2, we describe the GNN model used for semantic-aware quality status classification of building elements.

Converting BIM to a graph
In this work, we consider the elements in ad-BIM as nodes and their associations with other elements as the edges of the graph. Two nodes are connected through an edge if they are topologically or spatially related. Temporal relationships are not considered in this paper. The topological and spatial relationships between BIM objects are extracted from ad-BIM in IFC format. The topological relationships are directly extractable from the hierarchical inheritance associations of objects in the IFC schema. Spatial relationships are determined by quantifying the distance between the various BIM objects' axis-aligned bounding boxes (AABB). We identified three types of topological relationships (Figure 4): (1) inclusion; (2) opening; (3) connection; and two types of spatial relationships: (1) proximity; (2) interference.
An inclusion relationship refers to the relationship between a container element (e.g., wall) and a filler element (e.g., door). For example, a wall instance of the IfcWall class (e.g., IfcWallStandardCase) would be connected to an IfcOpeningElement through an IfcRelVoidsElement. A door in the same wall would be an instance of the IfcDoor class and would be connected to the same IfcOpeningElement through an IfcRelFillsElement. An opening relationship describes the association between a void and its container, which may or may not be filled by a filler element. A connection relationship is defined using IfcRelConnectsElements, which describes the elements' connectivity with a connection geometry (such as a point, curve, or surface). For example, walls A and B in Figure 4 would be connected through a surface and not through any other element.
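As a rough illustration of how the three topological relationship types could be turned into graph edges, the sketch below works on simplified tuples standing in for IFC relationship instances. The tuple layout, the element identifiers, and the decision to keep openings as nodes are assumptions made for this example, not details confirmed by the paper; reading a real IFC file would require an IFC parser, which is omitted here.

```python
def topological_edges(voids, fills, connects):
    """Derive (source, target, kind) edges from simplified IFC-like relations.

    voids:    (element, opening) pairs, standing in for IfcRelVoidsElement
    fills:    (opening, filler) pairs, standing in for IfcRelFillsElement
    connects: (element, element) pairs, standing in for IfcRelConnectsElements
    """
    edges = set()
    container_of = {}
    for element, opening in voids:
        container_of[opening] = element
        edges.add((element, opening, "opening"))
    for opening, filler in fills:
        container = container_of.get(opening)
        if container is not None:
            # The container and the filler are linked through their shared opening.
            edges.add((container, filler, "inclusion"))
    for a, b in connects:
        edges.add((a, b, "connection"))
    return edges


# Hypothetical elements: a wall with a door, connected to a second wall.
edges = topological_edges(
    voids=[("wall_A", "opening_1")],
    fills=[("opening_1", "door_1")],
    connects=[("wall_A", "wall_B")],
)
print(sorted(edges))
```

The relationship kind is kept on each edge only for readability; it could be dropped when building a graph whose edges carry no attributes.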
Our proposed method for identifying spatial relationships between BIM objects calculates a distance matrix based on the minimum distance between the objects' AABBs. The distance between two AABBs can be calculated using the coordinates of their bottom-left and top-right corners in 3D space. We then parametrize the proximity and interference relationships between objects in terms of the distance between the two BIM objects, two proximity thresholds, and an interference threshold. Finally, semantics such as an element's type and floor are extracted from the ad-BIM to build the ad-BIM graph $G = (V, E)$, where $V$ and $E$ refer to the sets of its nodes and edges, respectively. Nodes in $V$ contain features extracted from ad-BIM, graph structure, and on-site data, while edges in $E$ have no attributes.
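The AABB distance computation can be sketched as follows. The distance function follows directly from the corner coordinates; the classification rule and all threshold values, however, are assumptions chosen for illustration, since the exact parametrization is not reproduced here.

```python
import math


def aabb_distance(a_min, a_max, b_min, b_max):
    """Minimum Euclidean distance between two axis-aligned boxes.

    Each box is given by its minimum and maximum corner (x, y, z).
    Returns 0.0 when the boxes touch or overlap.
    """
    gaps = [
        max(0.0, b_lo - a_hi, a_lo - b_hi)
        for a_lo, a_hi, b_lo, b_hi in zip(a_min, a_max, b_min, b_max)
    ]
    return math.sqrt(sum(g * g for g in gaps))


def spatial_relation(d, interference_max=0.0, proximity_min=0.0, proximity_max=0.5):
    """Hypothetical rule: threshold names and default values are assumptions."""
    if d <= interference_max:
        return "interference"
    if proximity_min <= d <= proximity_max:
        return "proximity"
    return None


# Two unit cubes separated by a 1 m gap along x, and two overlapping boxes.
d_apart = aabb_distance((0, 0, 0), (1, 1, 1), (2, 0, 0), (3, 1, 1))
d_overlap = aabb_distance((0, 0, 0), (2, 2, 2), (1, 1, 1), (3, 3, 3))
print(d_apart, spatial_relation(d_apart))
print(d_overlap, spatial_relation(d_overlap))
```

Evaluating `aabb_distance` for every pair of elements yields the distance matrix, from which the spatial edges can be derived.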

Architecture and training settings
Graph Attention Network (GAT) [5] is a GNN architecture that is built by stacking graph attention layers (GAT convolutions). GAT convolutions use attention mechanisms to implicitly assign weights to different nodes in a neighborhood, indicating their importance. They may consist of multiple "heads", each with its own set of parameters. These heads attend to different aspects of the graph, allowing the model to learn multiple representations of the graph simultaneously. Mathematically, the attention coefficient between node $i$ and node $j$ is computed as:

$$\alpha_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}\left[\mathbf{W}h_i \,\Vert\, \mathbf{W}h_j\right]\right)\right)}{\sum_{m \in \mathcal{N}_i} \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}\left[\mathbf{W}h_i \,\Vert\, \mathbf{W}h_m\right]\right)\right)}$$

where $\alpha_{ij}$ is the attention coefficient between nodes $i$ and $j$, $\mathbf{a}$ is a learnable parameter vector, $h_i$ and $h_j$ are the node representations for nodes $i$ and $j$, $\Vert$ is the concatenation operator, $\mathcal{N}_i$ is the set of all the neighbors of node $i$, and $\mathbf{W}$ is a learnable weight matrix.
Finally, the new hidden representation of node $i$ ($h_i'$) in a GAT convolution with $K$ heads is calculated as:

$$h_i' = \Big\Vert_{k=1}^{K} \sigma\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k}\, \mathbf{W}^{k} h_j\right)$$

where $\sigma$ is a non-linear activation, and $\alpha_{ij}^{k}$ and $\mathbf{W}^{k}$ are the attention coefficients and the weight matrix of the $k$-th head.
As depicted in Figure 5, our model has a sequential architecture and consists of two multi-headed GAT convolutions and a fully connected layer with ReLU activations. We apply dropout before each layer and to the normalized attention coefficients in GAT convolutions. The output is a four-dimensional array that is normalized using a log-softmax function. The training is performed following a transductive learning approach. In this approach, the model is trained on training labels while accessing the entire graph structure and node features. We used the negative log-likelihood as the loss and the Adam optimizer with a learning rate of $5 \times 10^{-3}$ and a weight decay of $5 \times 10^{-4}$. The model was trained for 10K epochs with early stopping.
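The single-head attention computation of a GAT convolution can be sketched in plain Python as below. This is an illustrative re-implementation of the standard GAT formulation [5], not the model used in this work: the LeakyReLU slope of 0.2, the toy features, and the helper names are assumptions, and the output non-linearity, dropout, and multiple heads are omitted.

```python
import math


def leaky_relu(x, slope=0.2):  # slope value is an assumed default
    return x if x > 0 else slope * x


def matvec(W, h):
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def attention_coefficients(i, neighbors, h, W, a):
    """Softmax-normalized attention of node i over its neighbors (one head)."""
    Wh_i = matvec(W, h[i])
    scores = {}
    for j in neighbors:
        concat = Wh_i + matvec(W, h[j])  # list concatenation: [W h_i || W h_j]
        scores[j] = leaky_relu(sum(p * q for p, q in zip(a, concat)))
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}


def gat_update(i, neighbors, h, W, a):
    """Attention-weighted sum of transformed neighbor features (no activation)."""
    alpha = attention_coefficients(i, neighbors, h, W, a)
    out = [0.0] * len(W)
    for j, a_ij in alpha.items():
        out = [o + a_ij * v for o, v in zip(out, matvec(W, h[j]))]
    return out


# Toy example: three nodes with 2-D features, identity weights.
h = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
W = [[1.0, 0.0], [0.0, 1.0]]
uniform = attention_coefficients(0, [0, 1, 2], h, W, a=[0.0, 0.0, 0.0, 0.0])
biased = attention_coefficients(0, [0, 1, 2], h, W, a=[0.0, 0.0, 1.0, 0.0])
print(uniform, biased)
```

With an all-zero attention vector every neighbor receives the same weight, so the update reduces to mean aggregation; a non-zero vector shifts weight toward neighbors whose transformed features score higher.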

Baseline model
In this work, we use an ensembled baseline model to predict element-wise quality status. The baseline model is trained on features learned on point cloud data [21] and engineered features that relate to scan-vs-BIM coverage, as discussed in Section 4.1.3. This baseline model was trained on a large number of individual element data from multiple different construction projects, yet without considering relationships between elements.

Experiments & Discussion
In our experiments, we are interested in answering the following questions: (1) How robust is our method on a dataset with a mixture of graphs with both highly imbalanced and balanced labels? (2) What are the impacts of the size of the training, validation, and test sets on the model performance? (3) How does the BIM-GNN model perform compared to the baseline model, which does not consider the element relationships and graph structure? (4) How do different feature types contribute to the model's expressiveness? (5) How can BIM-GNN help label unobserved or partially observed elements?

Description
The dataset used in this work contains ad-BIMs and scans captured from three institutional building projects in Europe, including two scans from a university building (UB1 and UB2), one scan from a hospital (LH), and one scan from a special school (SS) project. All scans were processed and registered with their ad-BIMs, such that only a subset of the elements of the ad-BIM are associated with the 3D point cloud. Then, engineered and learned features for each element are computed solely based on the processed (partial) per-element scan of the on-site observations. An automatic ML-based process assigns each element a label based on the extracted features. A post-processing review is conducted to ensure the reliability of the assigned labels. We call a processed scan an analysis.

Figure 5. BIM-GNN Classifier architecture. Legend: dropout with a given probability; GAT convolution with a given number of heads, output feature size, and dropout, followed by an activation; fully connected layer with a given output feature size, followed by an activation.

Statistics
As summarized in Table 1, each analysis was converted to a graph with an average node count of 3103 and an average degree of 50. The edge counts varied from tens of thousands to hundreds of thousands. The relationships extracted from the ad-BIM turned out to be dominated by spatial relationships across all analyses. Figure 6 depicts the distribution of ground truth quality status labels for nodes in each graph. The label distribution is imbalanced across all scans: no data accounts for a large proportion of the elements, mainly due to data incompleteness in real-world projects.

Features
Nodes in the created ad-BIM graphs contain three types of features: BIM-based, graph-based, and scan-based. BIM-based features include semantics such as element type and floor. Graph-based features capture information about a node's local neighborhood and include node degree, eigenvector centrality, and clustering coefficient. The node degree is the number of edges incident to that node. Eigenvector centrality measures the importance of a node in a graph. The clustering coefficient measures how the node's neighbors are connected to one another. Scan-based features include BaselineML and PointNet features. BaselineML is a set of hand-engineered features that quantify the degree to which the ad-BIM elements match the scans and include features such as scanned fraction, relative alignment error, etc. PointNet features are extracted from the hidden representation of the last two layers of a pre-trained PointNet [21] model.
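The three graph-based features can be computed on a small example as sketched below. This is a plain-Python illustration under assumptions: the toy graph and function names are hypothetical, and eigenvector centrality is approximated by power iteration on the adjacency matrix plus the identity, which is one common way to obtain it rather than necessarily the procedure used in this work.

```python
import math
from itertools import combinations


def degree(adj, v):
    """Number of edges incident to v (adj maps each node to a set of neighbors)."""
    return len(adj[v])


def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def eigenvector_centrality(adj, iterations=100):
    """Power iteration on (A + I); scores are scaled to unit Euclidean norm."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        nxt = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        norm = math.sqrt(sum(s * s for s in nxt.values()))
        x = {v: s / norm for v, s in nxt.items()}
    return x


# Toy graph: triangle A-B-C with a pendant node D attached to C.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
cent = eigenvector_centrality(adj)
for v in sorted(adj):
    print(v, degree(adj, v), clustering_coefficient(adj, v), round(cent[v], 3))
```

In this toy graph, C has the highest degree and centrality, while A and B have the highest clustering coefficient because their only two neighbors are connected to each other.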

BIM-GNN Classifier performance
In our experiments, we used nine randomized training/validation/test splits. We started our evaluations with a split of 60% training, 20% validation, and 20% test, considering all features defined in Section 4.1.3. We denote this combination of features and data split as the dataset. We chose the F1-Score with weighted average as the metric for evaluation since it is suitable for evaluating both balanced and imbalanced datasets. Table 2 summarizes the performance of our model on the test sets across 9 runs based on the metric above. The results suggest that our model outperforms the baseline significantly. The average F1-Score of the baseline is 55.34% with a standard deviation of 20.26%, while the average F1-Score of our model is 77.99% with a standard deviation of 7.37%. The average improvement is about 22 percentage points, with a lower variance. The improvement is more significant when the dataset is imbalanced. For example, the average F1-Score of the baseline on the UB2 graph is 22.26%, while the average F1-Score of our model is 88.44%.
Figure 9 depicts a low-dimensional representation of node logits pre- and post-training using t-distributed stochastic neighbor embedding (t-SNE). t-SNE maps high-dimensional data to a lower-dimensional space while preserving its structure. Figure 9 reveals separate clusters for each label. While our model can differentiate between missing, no data, and the other two classes, it struggles to differentiate between deviated and verified in certain cases. Further investigation is needed, as mistakenly detecting deviated as verified may lead to unnoticed quality issues, although this is partly due to labeling subjectivity.
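For reference, the weighted-average F1-Score weights each class's F1 by its number of true instances. The sketch below implements that definition in plain Python on made-up labels; the label vectors are invented for illustration and are not results from this work.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


# Illustrative, imbalanced labels (10 elements, 6 of them "no data").
y_true = ["no data"] * 6 + ["verified", "verified", "deviated", "missing"]
y_pred = ["no data"] * 6 + ["verified", "deviated", "verified", "missing"]
score = weighted_f1(y_true, y_pred)
print(score)
```

Here the two confusions between verified and deviated pull the score down to 0.8 even though the dominant no data class is predicted perfectly.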

Influence of training data percentage
The dataset allows the model to access 60% of the labels directly through the training set and 20% indirectly through validation, meaning that with access to 80% of the labels, the node features, and the ad-BIM, we can outperform the baseline. However, such a high percentage of available labels may be impractical, so we decreased the percentage of available labels (keeping a 3:1 training-to-validation ratio) and repeated the experiments. Results in Figure 7 show that even with 10% of the node labels (7% training), our proposed method achieved a considerably higher F1-Score than the classical approach.
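One way to set up such a transductive split is to draw node index sets for training, validation, and test from a single graph, as in the sketch below. The function name, the seed handling, and the choice to treat all remaining nodes as the test set are assumptions made for illustration; only the 3:1 training-to-validation ratio comes from the text.

```python
import random


def split_masks(num_nodes, labeled_fraction, seed=0):
    """Return disjoint (train, val, test) node index sets for one graph.

    The labeled nodes are divided 3:1 between training and validation;
    every other node is placed in the test set (an assumption of this sketch).
    """
    rng = random.Random(seed)
    order = list(range(num_nodes))
    rng.shuffle(order)
    n_labeled = round(num_nodes * labeled_fraction)
    n_train = round(n_labeled * 3 / 4)
    train = set(order[:n_train])
    val = set(order[n_train:n_labeled])
    test = set(order[n_labeled:])
    return train, val, test


train, val, test = split_masks(num_nodes=1000, labeled_fraction=0.10)
print(len(train), len(val), len(test))
```

During training, the loss would be evaluated only on the training indices, while message passing still runs over the whole graph and all node features.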

Ablation study
An ablation study is performed to investigate the importance of different features. We retrained our model by removing one feature or feature set at a time from the dataset, namely, the PointNet, type, and BaselineML features, plus the graph-based and scan-based feature sets (Table 3). The results, shown in Figure 8, suggest that the PointNet features (learned scan-based features) do not positively contribute to the model's expressiveness, while hand-engineered BaselineML features seem to be more essential. Removing the type and the graph-based features can worsen the model's performance, while removing the scan-based features has the most negative impact. The results indicate that all features, except for PointNet, contribute to the superior performance of our model. Nonetheless, we retained the PointNet features to ensure smoother performance, as a significant increase in F1-Score variance across various analyses is observed when they are excluded.
Figure 7. Impact of labeled data availability percentage during training on F1-Score and comparison with the baseline.
Figure 8. Impact of feature ablation on F1-Score and comparison with the baseline.
More interestingly, comparing the baseline with our model trained without scan-based features suggests that our model can still outperform the baseline even without the scan-based features. This essentially means that given the graph structure and the labels for some elements (60% training), we can better predict the quality status of elements even if they are unobserved. This is a significant improvement on the baseline, which views the elements in isolation and solely relies on scan-based features. Our results suggest that incorporating semantics can improve the performance of element-wise quality status classification, an essential building block in construction quality assessment applications, despite the partial observability of elements.

Conclusion and future work
In this work, we demonstrated the utility of exploiting the semantic (i.e., topological and spatial) relationships encoded within the ad-BIM graph to improve automated construction quality assessment despite partial observability of elements. To enhance our approach, we plan to explore additional graph learning algorithms and incorporate more relationship types to better capture the relationships between mechanical, electrical, and plumbing (MEP) elements. Additionally, we aim to add relationship types as edge features and further evaluate the generalizability of our method by applying it to more projects. We also hope to leverage the temporal relationships in 4D-BIM to predict element labels based on time- and sequence-dependent contexts.