Date and time: 24/02/2022 at 15:00
Topic: Integrating Machine Learning and Computational Physics to Assess Crack Pattern Similarity in Masonry Buildings
Track: Structural Engineering
Description: Cracks in masonry structures are a cause for concern because they can signal a loss of functionality, of aesthetic quality, or of both. It is therefore important to identify the cause of damage, both to mitigate it and to prevent its recurrence. Similarities in crack patterns may correlate with similarities in the damage cause. Currently, masonry experts and structural engineers assess the similarities in crack patterns and their corresponding damage causes. This process is often expensive and subjective. A Convolutional Neural Network (CNN) may offer an alternative, robust and dependable means of automating the assessment of masonry crack patterns by processing their images.
The main research goal of this MSc thesis is to determine how accurately a CNN fitted to data generated from finite element models can estimate masonry crack pattern similarities. Developing a neural network that can perform such an automated assessment with a high degree of accuracy requires a large number of crack patterns with similarity ratings given by human experts. This data is collected in order of increasing complexity. It starts with a statistics-based approach that generates synthetic crack patterns from Markov walks. This is followed by a computational physics-based approach using the Finite Element Method (FEM), which generates crack patterns on 2D masonry façades subjected to differential settlements and out-of-plane loads. Finally, real-world data is also collected. This data is used to fit and test a convolutional neural network developed by Kleijn (2022). Continuing the previous line of research at TNO (where 12 crack patterns were chosen and developed using the statistics-based approach), this thesis focuses on developing parametric finite element models of 8 of these 12 Pattern IDs. Additionally, real-world images are collected from Gouda in The Netherlands. The data is then used to form crack pattern image pairs, which 28 raters assess for similarity using three similarity label categories: the crack pattern similarity label, the damage severity label, and the overall similarity label. Using these labels, the raters assessed 2587 image pairs generated from the statistics-based approach, 500 image pairs from the computational physics-based approach, and 50 image pairs that combine images from the statistics-based approach, the computational physics-based approach, and the real-world cases.
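As an illustration of the statistics-based approach, the sketch below draws one synthetic crack on a pixel grid with a Markov walk, meaning that the direction of each step depends only on the direction of the previous step. The rules of the walk used in the thesis are not given here, so the step set, the transition probabilities and the function name `markov_walk_crack` are all hypothetical.

```python
import random

# Assumed step set: every step moves one row down, optionally one column sideways.
STEPS = {"down": (1, 0), "down_left": (1, -1), "down_right": (1, 1)}

# Assumed transition probabilities P(next direction | current direction).
TRANSITIONS = {
    "down": {"down": 0.6, "down_left": 0.2, "down_right": 0.2},
    "down_left": {"down": 0.3, "down_left": 0.6, "down_right": 0.1},
    "down_right": {"down": 0.3, "down_left": 0.1, "down_right": 0.6},
}


def markov_walk_crack(height, width, start_col, seed=None):
    """Return a height x width grid of 0/1 values with one crack path marked as 1."""
    rng = random.Random(seed)
    grid = [[0] * width for _ in range(height)]
    row, col, state = 0, start_col, "down"
    grid[row][col] = 1
    while row < height - 1:
        options = TRANSITIONS[state]
        # The next direction is sampled from the row of the current direction only.
        state = rng.choices(list(options), weights=list(options.values()))[0]
        d_row, d_col = STEPS[state]
        row += d_row
        col = min(max(col + d_col, 0), width - 1)  # keep the crack on the façade
        grid[row][col] = 1
    return grid


if __name__ == "__main__":
    for line in markov_walk_crack(height=8, width=12, start_col=6, seed=1):
        print("".join("#" if cell else "." for cell in line))
```

Because every step advances exactly one row, each row of the grid contains exactly one crack pixel. A fixed `seed` makes a pattern reproducible, which helps when the same image has to appear in several image pairs.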
An inter-rater agreement analysis is performed on the similarity assessments using Krippendorff’s alpha. Additionally, the agreement of each rater with a chosen standard rater is studied using Lin’s Concordance Correlation Coefficient (CCC). Lin’s CCC is also used to assess the intra-rater agreement of the standard rater, that is, how consistent the rater is with their own annotations. The labelled image pairs are then used to fit and test the regression neural network and to evaluate its accuracy in predicting the similarity labels. The neural network is also fitted to and tested with various combinations of labelled data to study its generalisability.
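For reference, Lin’s CCC between two sets of ratings x and y is 2·s_xy / (s_x² + s_y² + (x̄ − ȳ)²), where s_xy is the covariance and s_x², s_y² are the variances. A minimal sketch follows, assuming the ratings are numeric and using the population (1/n) moments; the function name `lins_ccc` and the example ratings are illustrative and not taken from the thesis.

```python
def lins_ccc(x, y):
    """Lin's concordance correlation coefficient of two equally long rating lists."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined when both raters give one identical constant rating")
    return 2 * cov_xy / denominator


if __name__ == "__main__":
    standard_rater = [1, 2, 3]   # made-up ratings
    other_rater = [2, 3, 4]      # same ranking, but one level higher throughout
    print(lins_ccc(standard_rater, other_rater))
```

Unlike the Pearson correlation, the CCC penalises a systematic offset between two raters: in the usage example the Pearson correlation would be 1, while the CCC is 4/7. For intra-rater agreement, x and y would be the same rater's first and repeated annotations of the same image pairs.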
It is found that, in all three sets of data, Krippendorff’s alpha is less than 0.80 for all the labels, which indicates insufficient agreement among the raters. It is also seen that, in general, agreement among the raters increases with their experience level, i.e. the descending order of agreement within each rater group is: industry experts, PhD students, and MSc students. Comparing each rater with the standard rater by means of Lin’s CCC helps to choose the raters who can be considered as reliable as the standard rater. Additionally, the intra-rater agreement analysis of the chosen standard rater shows that the highest self-consistency (agreement) is achieved for the crack pattern similarity label, followed by the overall similarity label and finally the damage severity label, with corresponding Lin’s CCC values of 0.96, 0.86 and 0.72, respectively.
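Krippendorff’s alpha is defined as 1 − D_o/D_e, where D_o is the disagreement observed within the ratings of each item and D_e is the disagreement expected if ratings were assigned to items by chance. The sketch below assumes the interval distance (squared difference between ratings); the distance metric actually used in the thesis is not stated here, and the function name and input layout are hypothetical.

```python
from itertools import permutations


def krippendorff_alpha_interval(units):
    """units: one list of numeric ratings per image pair; missing ratings are simply left out."""
    # Items rated fewer than two times cannot be paired and are ignored.
    pairable = [u for u in units if len(u) >= 2]
    values = [v for u in pairable for v in u]
    n = len(values)
    if n < 2:
        raise ValueError("at least one item with two or more ratings is required")
    # Observed disagreement: ordered pairs within each item, weighted by 1 / (m_u - 1).
    d_o = sum(
        sum((a - b) ** 2 for a, b in permutations(u, 2)) / (len(u) - 1)
        for u in pairable
    ) / n
    # Expected disagreement: ordered pairs over all pairable ratings, ignoring the items.
    d_e = sum((a - b) ** 2 for a, b in permutations(values, 2)) / (n * (n - 1))
    if d_e == 0:
        raise ValueError("alpha is undefined when all ratings are identical")
    return 1.0 - d_o / d_e


if __name__ == "__main__":
    ratings_per_pair = [[1, 1, 2], [3, 3], [2, 3, 3, 2]]  # made-up ratings
    print(krippendorff_alpha_interval(ratings_per_pair))
```

An alpha of 1 means perfect agreement, 0 means agreement no better than chance, and negative values indicate systematic disagreement. The expected-disagreement term loops over all pairs of ratings, so this version is quadratic in the number of ratings and is meant for illustration rather than for large data sets.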
The neural network is tasked with predicting the similarity level for each of the three similarity labels and for each image pair in the test sample. The ground truth for the neural network is established by averaging the similarity ratings given to each image pair by multiple raters. It is found that the neural network achieves a sufficiently high degree of accuracy when fitted to and tested with all the image pairs generated from the computational physics-based approach. The crack pattern similarity label, the damage severity label, and the overall similarity label achieve an accuracy of 87%, 82%, and 69%, respectively. However, the generalisability experiments, in which the neural network predicts the similarity of a type of crack pattern image pair that is not included in the fitting data set, show very poor prediction accuracy for the similarity labels. When the neural network attempts to predict the similarity for a Pattern ID or a façade geometry that it did not see in the fitting procedure, it predicts all three labels with an accuracy that varies from 40% to 50%. Additionally, the neural network is fitted to images generated from the computational physics-based approach and then tested with a pool of image pairs generated from the statistics-based approach, the computational physics-based approach, and real-world images. The average accuracy with which the three similarity labels are predicted is even lower, lying between 25% and 40%.
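The averaging step that produces the ground truth can be sketched as follows. The record layout `(pair_id, rater_id, score)` and the function names are assumptions. How the reported accuracies are computed is not stated here, so the tolerance-based `accuracy_within` is only one hypothetical way of scoring a regression output against an averaged label.

```python
from collections import defaultdict
from statistics import mean


def ground_truth(ratings):
    """Average the scores that each image pair received from its raters.

    ratings: iterable of (pair_id, rater_id, score) tuples (assumed layout).
    """
    scores_by_pair = defaultdict(list)
    for pair_id, _rater_id, score in ratings:
        scores_by_pair[pair_id].append(score)
    return {pair_id: mean(scores) for pair_id, scores in scores_by_pair.items()}


def accuracy_within(predictions, truth, tolerance=0.5):
    """Hypothetical metric: share of pairs predicted within `tolerance` of the averaged label."""
    hits = sum(abs(predictions[pair_id] - truth[pair_id]) <= tolerance for pair_id in truth)
    return hits / len(truth)


if __name__ == "__main__":
    ratings = [("a", "r1", 1), ("a", "r2", 3), ("b", "r1", 4)]  # made-up ratings
    truth = ground_truth(ratings)
    print(truth)
    print(accuracy_within({"a": 2.4, "b": 3.0}, truth))
```

Averaging turns the discrete ratings into continuous targets, which is consistent with treating the task as regression. It also means that disagreement among raters is hidden in the target value rather than removed.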
This MSc thesis concludes that the neural network fitted to data generated from the computational physics-based approach and assessed by all the raters is able to predict the crack pattern similarity label, the damage severity label and the overall similarity label with sufficiently high degrees of accuracy. However, the generalisability experiments on the neural network show very poor results. This indicates that, to achieve greater prediction accuracy, the neural network may need to be fitted to a considerably larger sample of crack patterns that covers all of the relevant situations. Furthermore, the substantial inter-rater variability in the labelling of crack pattern image pairs suggests that even an ideal neural network architecture may not be able to overcome the inconsistencies in the fitting data.