Analysis of craquelure patterns in historical paintings using image processing along with Neural Network algorithms

Recent advances in technology have brought major breakthroughs in deep learning techniques. In this work, the author will elaborate on such techniques for output data of image processing performed on craquelure patterns in historical paintings. Historical painted objects, especially panel paintings, with their long environmental history, exhibit complex crack patterns called craquelures. These are cracks in paintings that can be referred to as ‘edge fractures’ since they are formed from the free surface. The analysis has been conducted on the set of selected craquelure patterns to which a recent deep learning method, i.e. Neural Networks algorithm is implemented and the results of such a self-learning process are discussed.


INTRODUCTION
Historical painted objects, with their long environmental history, exhibit a complex network of cracks called craquelures. Over the last decades, craquelure patterns in paintings have been intensively studied. Currently, the craquelures research is becoming one of the most captivating and intriguing challenges at the borderline between humanities represented by art history and conservation on one hand and natural sciences and engineering on the other. Motivation for research on craquelures are twofold. On one hand, this research are driven by academic interest how craquelures develop and why one can see very different and distinguishable craquelure patterns. On the other, this research aims at decisively contributing to the development of new evidence-based environmental specifications for paintings which are the most precious and vulnerable heritage asset in museums and historic buildings worldwide. In spite of recent advances in understanding the physical response of historic materials to changing environmental conditions, lack of material properties of historical materials and limited information on the effect of the damage accumulated in paintings in the past on their susceptibility to environmental variations have been perceived as two major deficiencies of the research. The deficiencies are raised as main arguments against relaxing climate control in museums which would make possible movement to green museums, that is reducing energy consumption and carbon emissions. The craquelure studies involve different approaches, starting from the use of digital technology to experimental measurements of prepared samples based on historical paint formulas. For instance, S. Bucklow compared different methods for craquelure description [1]. Other works involve crack channeling mechanisms [2], desiccation crack patterns [3], craquelure formation mechanisms [4,5], development of cracks [6], Neural Networks for the description of craquelures as mathematical graphs [7,8], and more. This leads to uncovering the physical processes, paintings history, geographic origin or characteristics of cracks, such as thickness and their depths, or categorization of craquelure patterns. Some features are also used for the authenticity of paintings. The majority of the works focusing on craquelures involve image processing to identify the cracks. The development, recent advances in technology as well as available software allow us to take heritage science to a new level. In this work, image processing is used to extract craquelure patterns. Moreover, the progress of machine learning is put into use. Cracks to be observed on the surfaces of different materials, in particular on historical paintings, are ideal subjects for machine learning algorithms due to the big amount of data that the system can learn from. The areas where automated data analysis can be implemented in the domain of heritage science are vast. Craquelure recognition, identification, and categorization are one of them. To date, research shows that different origin of the paintings like Italian, Flemish, Dutch, and French, from the 14th -18th centuries manifest their diversity in the patterns of cracks. Therefore, the primary phase of this research involving image processing, deep learning, and cracks characterization, is narrowed down to the Dutch Golden Age only. Taking into account more than one geographic origin and different times would be ideal as various sources and their differences could be compared. Such extensive studies are planned for future work. However, to do so, good quality sources of images and a longer period time are required. Here, we focus on the creation, improvement, and optimization of the algorithms for craquelures. Nevertheless, the results will be highly valuable for any forthcoming works as well as for the heritage science community. Another challenge encountered during this research is the image size and resolution which translate to computational time large data sets. Additionally, large data lowers the performance of the device due to memory and processor consumption. Hence, only small fragments of the whole painting were examined. The inclusion of smaller samples also results from the fact that we focus on one type of pigment alone, namely white lead.

An algorithm for cracks extraction
The first step of the procedure is to extract the craquelure pattern from the selected fragments of the painting. To do so, simple image processing operators have been implemented in an appropriately established order. The crack recognition is performed based on the fact that crack lines emerge with the color difference to the background paint on which they occur. It should be noted that for dark backgrounds, it is more complex to evoke the cracks line. In most cases, they are just unnoticeable both with the use of a machine and with a human eye. Only zooming in the area of interest, ideally with the use of a microscope, can give a picture in which all the details are revealed. The bright background, on the other hand, constitutes a more convenient and thus more practical case. Numerous examples of historical paintings indicate a higher contrast between the crack lines and homogenous background cracks in the regions of light colors, as in this work case. The algorithm has been developed using the Wolfram Mathematica software. Several available functions have been applied, from which a couple of parameters were generated. The values of these parameters were chosen for each fragment of the painting from the created database (Table 1) depending on contrast, quality of an image, and density of craquelures. The procedure of crack extraction includes the following stages: • Stage 1: An identification of the cracks. To distinguish the line cracks, the selection of morphological operators has been implemented. Foremost, a grayscale is applied to operate on lesser data for each pixel. Secondly, the image adjustment is introduced, from which the first parameter emerges. However, for all of the painting fragments considered in this work, the automatic adjustment parameters were used, being the contrast, brightness, and gamma correction. They are set by the linearization of the distribution function of the combined values of pixel intensiveness. Next, the so-called gradient filter operator was adopted. This operator convolutes the first derivative of a Gaussian with radius r being twice the standard derivation in order to detect areas of a rapid change in data image. The radius is fixed and set to be r=0.5 for all samples. Such a choice gives the sharpest crack lines while gradient filtering. Hereafter, more complex processes were used. From there, an initial step was to apply a ridge filter that measures a principle curvature to find and extract ridges in an image. This is carried out by a local approximation of the pixel values with a polynomial of the second order, known as the Hessian matrix. Specifically, pixels with a greater major eigenvalue than the minor of the Hessian matrix are selected and so the matrix elements were rearranged giving the brightest values of the pixels at the found ridge line locations. At this point, a second parameter occurred subjected to the ridge filter r F . In our examples of the paintings, it oscillates from r F =2 to 7. The lower r F , the fewer details were recognized, and thus fewer crack lines were detected. The r F =2 is valuable for the the craquelure pattern where all the lines have roughly the same intensity level, which means the primary and secondary cracks equalized or no secondary cracks materialized. For r F <2, the primary cracks are lost. On the contrary, higher r F is effective for craquelure patterns which exhibit the variation in the intensity of cracks throughout the image fragment. For the majority of the samples chosen for the research of this work, the value of the parameter was set r F =6. Furthermore, one more time image adjustment was performed and on top of it the binarization operator with a threshold parameter t above which all pixels took a value 1 and the rest remained 0. The range of t value fluctuated from 0.015 to 0.1 for the chosen painting fragments, where the low value brings more contours of cracks as an outcome in the final image, whereas the large t generates less of them. This step was one of the preprocessing functions allowing for feature and Having the binarized image of identified contours, the operator responsible for selecting specific components was implemented to eliminate any small details and objectionable parts, leaving only the desired lines. The operator selecting specific components was based on the provided criteria c. Parameter c took the values from 100 to as much as 500 that corresponded to many small components that were removed from a certain fragment of the painting. The number depends mostly on the frequency of undesired details and the size of the image itself. The result holds thick lines. Therefore, a trimming function was applied in order to trim them down. Lastly, the final pruning operation eliminates outer parts of branches which were not vital to the overall craquelure image output. The adjusted and prepared image has been achieved, which result is a specific sequence of pixels. Table 1 summarizes the collected operators with their parameters if any are attributed. Table 1. Operators and their attributed parameters of a crack identification step for the crack extraction algorithm.
• Stage 2: Transformation of detected cracks into a defined lines family The second major step of the crack extraction algorithm establishes the transformation of detected cracks from the first step into a set of defined lines. This is carried out by detecting the endpoints of each recognized line in a modified and prepared image. That being said, a morphological transformation is applied, consisting of several operations such as convolution of erosion with dilatation, top-hat closing, and skeletonization. These functions find endpoints and branches in skeletonized objects. Certainly, not all the pixels resulting from the first step are needed for line reconstruction. Indeed, only the key pixels are looked for, using the operators mentioned above and marked as important. These are nodes of three types: when the lines cross each other i.e. junctions of four-line cracks, crossing of the three lines i.e. "Y" type, and the lines with dead ends that have no further connections. Once all such pixels are found, their adequate connections need to be defined. This requires a recognition function that will indicate which nodes are linked together. This is accomplished by assigning an index to each node. The junction type enables the algorithm to accurately establish the edges. Subtracting the pixels that correspond to the nodes from the image, generates a set of lines. These are equivalents to crack edges. Given indices of pixels enable to keep a record of the numbering of the lines generated as well as measuring their basic characteristics, like distance or angles. It is worth mentioning, that the algorithm will not consider the small round, circle-shaped cracks. It recognizes them as closed loops rather than a category for lines. The last phase is the simplification of the procedure. As a consequence of the operations that have been used, too many nodes are found. Indeed, some of the crack lines that a human eye would consider as one path, the algorithm separates into many small parts creating an unnecessarily large amount of lines. Therefore, the distance between the major points is measured and the middle points at that distance and its close neighborhood are calculated. If the points are close enough to a straight line connecting the major points, all middle pixels can be ignored. The process is repeated till no neighboring points are found. The result of this step provides the endpoints of the edges and edges of the cracks themselves. The process of both steps is shown in Figure 1, on the example of a fragment from the painting "Woman in Blue Reading a Letter" by Johannes Vermeer.

Selection of historical paintings fragments
White pigment is one of the most vital colors in art. For years it has been used as the background in paintings, as well as to highlight particular parts of art giving them warm tones. Despite its toxic properties, white lead has been used continuously throughout history. The effects of white lead were appreciated by such artists as Vermeer or later Van Gogh. In this paper, profound research has been performed on paintings for which white lead was used. To create a database, accessible electronic sources have been explored. Only a few paintings were taken into account due to the complexity with crack identification as mentioned above. In the majority, lack of good quality images was a problem but also insufficiency in white parts of paintings exclusively or mixed with light color additions that change neither chemical nor mechanical behavior of white lead pigment. The quality of a painting is essential for the cracks extraction algorithm to be well-defined. One of the considered paintings with white lead is that of Fabian and Fortunato [Fabian & Fortunato 2010]. The Author of this work looked into seventeenth-century paintings towards the determination of lead pigment origin. The list of sampled paintings delivered in that work consists of a large amount of data including such information as author, the title of art, collection, location or region where each sample was painted, and its chemical composition. Still, many of them are inoperative for the goal of this work since no white lead alone had been found but rather the mixture with other pigments which might change the craquelure patterns. In Figure 2a the difference in craquelure structures can be seen in the example of "The Martyrdom of Saint Sebastian" by Battistello Caraccialo. Such variabilities are especially visible in the sweeping areas from light to dark color shades. In the chosen example, the denser net of cracks appears in dark regions, while the less packed in light ones. The contrary situation can be noticed in the painting "Sara Serena Rubens" by Peter Paul Rubens in Figure 2b in which very clear horizontal cracks (primary cracks) go through gray, white and, then skin tones. The same holds with the net of cracks in between the primary cracks. Indeed, variation in pigmentation does not influence the structure of the craquelure arrangement. It suggests that the artist used white lead also in gray and skin tone region. It is worth mentioning that the crack extraction algorithm created for the purpose of this work has its major disadvantage. Namely, the process of craquelure extraction does not distinguish cracks itself from the lines constituting the changing of colors in paintings. It is especially evident in Fig. 2b in which the line of the cheek is also recognized as a crack. Therefore, the database created in this work considers fragments of paintings in which no change of color appears. Consequently, the well-defined craquelure pattern derivation is assured. A complete database can be found in Table 1 sample of fragments cropped out from the chosen painting is shown in Figure 2. The complete database consists of 21 fragments of four different paintings.

Characterization of extracted cracks
As mentioned earlier, the created algorithm makes the points and edges recognition possible. We obtain a defined family of lines spread over the region of the same image size as the initially chosen fragment of the painting. That being said, each fragment has certain ranges of axis x and y. The statistical analysis constitutes specifying a couple of line characteristics such as their lengths, the total number of lines per unit square, and the deviation of lines from the vertical line. From these three types of grouping, one is chosen to categorize the lines that represent the craquelure pattern. The selected categorization is the one expressing parallel or perpendicular lines depending on the angle α they create with a respective direction. For sets of lines with a specific α that changes every 10 degrees, the length of the mean cracks is calculated as well as their amount. These calculations are followed up by the machine learning algorithm, to be specific the Neural Networks. The self-learning procedures of this algorithm are fed with the obtained results from the manual crack identification.

Determination of craquelure pattern distribution using Neural Networks
Build-in functions of the Wolfram Mathematica software are used for Neural Networks implementation. The adopted technique for this work is to learn the distribution of the craquelure patterns for each fragment of the paintings selected and listed in Table 1. The Neural Networks algorithm produces different results every time used, even with the same input data. The reason lays in the learning processes of the machine learning algorithm in which multiple learning paths are possible. Nevertheless, all output data are correct and should not be neglected. To verify the right direction, we take ten learned distributions with the amount of the points equal to the number of the endpoints found manually in the earlier section for the same fragment of extracted craquelure pattern. The results are later put into the final operation that defines the order of the distribution. For all ten cases, we obtain the same order number. Hence, the representation of the primary distribution is taken in the following calculations. Training distribution on the data given as a list plot of points provides a possibility to generate a similar craquelure pattern. Nonetheless, the new pattern is artificial, by means does not have a real equivalent in historical paintings, in any case not necessary. The algorithm used ends up with a smooth kernel distribution in which the probability density for value x is given by where k(x) is kernel and h bandwidth parameter. Here, the Gaussian kernel is used. The use of the method based on kernel density estimation calculates the bandwidth for each sample. To validate the correctness of the algorithm, the source data is divided into two parts (training data and test data) and used as a benchmark. Data division is repeated multiple times with different distributions of the source data and provides similar results. The obtained distributed data of points that represent the craquelure pattern is analyzed by looking at the histograms of each axis. The histograms are simply projections of point distribution to each axis. One can notice the periodicity of the signal obtained. The ranges of histogram bars are diverse due to points scattered throughout the plot. In Figure 3, the visualization of histograms of each axis and the found distribution of the cracks network is shown.

Defining an order parameter of the recognized craquelure pattern
The constructed model for the craquelure network reconstruction and its points distribution representation in a twodimension space presented in section 2.3 led to the reasoning of how to categorize the results of different painting fragments. For this purpose, a form of the order parameter is determined. The specific methodology has been applied. That is to say, the mathematical construction of mesh called a Voronoi diagram is used as a parameter identifying the degree of the arrangement order of crack lines in a given region. The Voronoi diagram creates a convex polygon, the socalled Voronoi cell, for each point in the plane. Mathematically, for the set of points P in space , the Voronoi cell corresponding to each point is defined as . In the area , the edges are nearest to than to any other point in : For the configuration of P points, the group of Voronoi cells are produced. The created polynomials never overlap. In the case of the points distribution that is highly nonuniform, the Voronoi cells significantly differ in shape and size, whereas for the arrangement close to even, the cells are nearly the same. In the event of the perfectly uniform configuration, the cells are all identical to the shape of a honeycomb. Figure 4 shows the Voronoi mesh for the set of points which represent the craquelure pattern of the case fragment. Next, the Voronoi iteration process, known as Lloyd's algorithm, has been adopted to obtain an equal arrangement of the initial point configuration. The goal is to calculate the number of iterations needed to reach the composition of the points as near to an ideal uniformity as possible. The iterations number sets the degree of order in the system. The higher iteration, the higher degree of disorder, while the lower iteration number, the arrangement of points is more regular, evenly spaced over the space of the painting fragment. The input for the iteration algorithm is the configuration of the initial points obtained from calculations from the first part of this section i.e. the extraction of a craquelure pattern along with its point distribution that constitutes its representation, such as a set of points scattered in the plane of specific dimensions. Next, the Voronoi mesh is implemented for this point distribution. The third step includes finding the centroid in each created cell of a Voronoi diagram. A new points formation P I is generated. Here, the first iterations end. In the second iteration, again the Voronoi mesh is executed for the new points network and the new centroids are found. The procedure is repeated till the configuration of the points is obtained, as evenly spread as possible. Hence, the point configuration corresponds to an array of point sets . The number of iterations N constitutes an order parameter that reveals the degree of complexity for a given set of points.

Human controlled calculations
In this section, the results of craquelure patterns of the given painting fragments are shown, in which the white lead paints is used. Four selected paintings are presented in Table 1. From each painting as many as possible fragments were cropped. Thus, from the "Woman in Blue Reading a Letter" by Johannes Vermeer, 13 fragments were extracted due to the wide area of bright parts in this particular painting. In the rest of the chosen paintings, two or three fragments were considered. Limitation in these cases arises from insufficient regions with white lead. For all fragments, both the high and low qualities were image-processed with the two steps algorithm described in section 2.1. The automatic image adjustment allowed to equalize the quality of fragments that were in worse condition. First operators in the algorithm methodology, such as grayscale, image adjustment, and a gradient filter, allow for separating the dataset of the painting fragments into two categories. The first category, consisting of fifteen cases, has clear and visible cracks that easily can be distinguished from the background. For these, the following set of parameters was employed: r F = 5, t = 0.094, and c = 300. One can notice the precise numbers especially for parameter c, in which the third decimal place is also influential. The choice of such a parametrization is based on the trial and error method till the best match of the extracted cracks with a real object is obtained. The second category includes the examples in which cracks are less evident. For these samples, different parametrization was applied: r F = 5, t = 0.07, and c = 500. Thus, a group of examples with the extracted cracks presented as the black contours on the white background is obtained. At this point, the next part of the methodology provided in section 2.1 is carried out. This part focuses on the craquelure pattern of derived sample image interpretation from which the defined lines are determined. For each image that represents a particular painting fragment, lines with its endpoints are found in the way that the structures of cracks of the real object are still preserved. Indubitably, it may be affirmed that all compositions of lines and points of the selected painting fragments are homogeneous. This fact is also backed up by the characteristics and statistics of lines and points of each fragment. Indeed, the results obtained from manual calculations that can be found in the first three columns of Table 3, are characterized by similar features. The number of lines for which the angle between them and the vertical line is =45 0 , is roughly two times lesser than the total number of lines. This indicates the isotropicity of the pattern, i.e. no specified direction of cracks. With such results, there are many possibilities of the composition of lines, such as square patterns or mud type cracks. The statistical calculations were also performed for other angles. Additionally, an analysis of a different mock-up was carried out. This work is described in paper submitted to be published [Abdollahzadeh Jamalabadi et al. 2021]. That is to say, the craquelure pattern on gesso layer on a wooden panel was experimentally prepared. The cracks lines that emerged on gesso exhibited an evident distinction of lines along the longitudinal wood direction which appeared to be longer than the ones across that were significantly shorter. Certainly, for an angle =10 0 from the longitudinal direction, 75% of lines lied inwardly. The percentage raised to 91% for already 20 0 . Only a small part of lines across the grain appeared. On the contrary, the fragments considered in the research of this work, being the historical paintings in which the white lead pigment was used, have no specified orientation nor length. Calculated mean lengths of the lines of each fragment are comparable with respect to the size of the cut fragments. Few exceptions appear in the data, for example, the mean length of the last three samples D1-D3 from the same painting is significantly smaller. The total number of lines is also higher which translates into more line intersections. Moreover, fragments D1-D3 as the only one from chosen painting selection is on wood panel. Hence, the main cracks -vertical lines -that go through wide areas of paintings are the response of wood substrate strain change. The painting fragment numbers, placed in first column of Table 3, correspond to the historical paintings listed in Table 1. Each of the paintings has a capital letter assigned to it, whereas the fragments cropped out from them are numbered.

Machine Learning computation
One of the major elements of this work is the use of Neural Networks series of algorithms. For each set of points that represent both the endpoints and lines of craquelure patterns (extended input), a particular computation that learns the distribution of points based on deep machine learning is implemented. The method is self-training based on provided point distribution. The learn distribution process includes such indicators as training examples used and best-chosen methods. Training examples vary depending on the number of points given, see Table 3. Regarding the methods, Contingency Table and Kernel Density Estimation were employed. The number of points that the algorithm returns is set to be the amount of the endpoints of the extracted lines. This number can be found in column 3 of Table 3. It should be noticed that the endpoint amount will not necessarily be double the line quantity since many of them are the connections of lines. In this sense, one point can be an endpoint of one, two, or three lines -more than three is a very rare case in general, and it does not occur in this study. The results collected for all considered painting fragments show the similarity in their homogeneousness, which was already evident in the characteristics of the extracted lines analysis. Therefore, the output of Neural Networks confirms the comparable behavior of craquelure patterns in the painting fragments where white lead was used. However, the research of this work includes only four paintings. Hence, extended research comprising of a broad database of paintings is required. Nevertheless, the paintings in this work exhibiting their uniformity are compared with artificially generated craquelure patterns. Thereby, a parallel analysis has been performed revealing the distinctions between the lead white fragments and generated crack lines of different pattern categories. That being said, three imitations of the obtained cracks were created. Two of them are the horizontal crack lines with no secondary cracks in between and that do not cross themselves. The third one presents the circles with a common center. Such a pattern is usually caused by the mechanical impact, points of tension, or a different type of support structure. All generated and three chosen crack lines extracted from the selected historical paintings (fragments A5, B1, and C1 from Table 3) are shown in Figure 5.  Table 3, on the right side: artificially generated cracks described above.
Next, Lloyd's algorithm is applied and the number of iterations needed to obtain the evenly spaced sets of points calculated. The indicator for the uniform distribution is the area of each cell in the Voronoi mesh. The more equal the areas of individual cells in one fragment, the higher will be the homogeneousness of the system. The areas can be visualized by data histograms. The percentage of k iteration is computed in relation to the input point distribution that is iteration number zero. Therefore, the question arises: after how many iterations do the areas of each cell equalize. An infinite number of iterations are needed to equalize the cells almost at the same level. Nevertheless, the Lloyd algorithm does not lead to a perfectly distributed space of points i.e. the honeycomb pattern. As far as all studied cases of historical paintings are concerned, the results are roughly the same. For more than 200 iterations, the percent of the difference in areas, from the biggest to smallest, in two sets that correspond to no iteration case and iteration number k, barely changes or does not change at all. On that account, for 200 iterations, the percentage is 7.3%, whereas for 1500 iterations it is 7.2%. This indicates the vertical asymptote at around 7% . The drop of the iteration number results in a percentage increase. At the level of the fifth iteration, the areas equalize so that the percentage is dropping to 25 -60%. This relation can be seen in Figure 6, in which a dependency between the number of iteration and the percentage of the cell area is presented. It can be noticed, that the knee of the curve and exponential function vary depending on the studied fragment. However, the knee points for each fragment are in the range of 10 to 12 iterations. The average is equal to 11, which is marked in Fig. 6 by the grey line. The knee point of each plot is adopted to be a threshold at which the point composition becomes homogeneous. The percentage of the area and the corresponding iteration number for each fragment are listed in Table 3. As it was mentioned earlier, the artificially generated point distribution of the specific craquelure patterns was also computed for comparison purposes. The output data for these three cases are shown in Figure 6 in blue. The Y-axis for these curves is on the right side of the graph. The behavior can be easily understood. The second iteration constitutes the threshold of homogeneity. Further iterations do not change the composition of the points and this stability is held on 70 to 75% of the obtained areas. Such a result indicates that the methodology in place is sufficient for the craquelure pattern type recognition at a certain level of credibility. Clear differentiation can be seen between the generated cracks, that are without noises, and the crack lines extracted from the studied historical paintings. One may predict that for a wider database of the fragments, the curves at the percentage-iteration dependency graph will be more spread holding the same average iteration value at the level of iteration=11. The future work will include different types of pigments along with white lead to compare various results and obtain the full picture of the methodology strategized in this work. Furthermore, the experimental works that currently are being performed will also be included in the craquelure patterns research. Experimental measurements will provide a comprehensive database of material properties and structural features of pictorial layers, in particular, different types of pigments. As of today, an extensive study is conducted for the lead white, yellow ocher, and azurite. The physical analysis of the prepared samples is carried out in the laboratory including the measurements using environmental rooms, universal testing machines for tensile properties determination, a dynamic mechanical analyzer, as well as a vacuum microbalance for measuring water vapor adsorption isotherms coupled to a vessel for measuring moisture-related swelling. Such parameters as Young's Modulus and strain at break are calculated for the painting samples.  Table 3. The blue lines correspond to the three artificially generated craquelure patterns from the right part of Figure 5, marked A, B, and C. The diamond-shaped points are the knee of each curve. The vertical grey line shows an average iteration number for historical paintings.  A1  1727x960  321  64  169  22115  44  11  A2  1936x993  314  61  158  21829  36  11  A3  2218x1070  577  54  284  34143  41  11  A4  1895x976  217  82  131  19954  24  11  A5  1880x1046  357  62  182  23124  41  11