Enhanced Fuzzy-Based Local Information Algorithm for Sonar Image Segmentation

The recent boost in undersea operations has led to the development of high-resolution sonar systems mounted on autonomous vehicles. These vehicles are used to scan the seafloor in search of objects such as sunken ships, archaeological sites, and submerged mines. An important part of the detection operation is the segmentation of sonar images, where the object's highlight and shadow are distinguished from the seabed background. In this paper, we focus on the automatic segmentation of sonar images. We present our enhanced fuzzy-based with kernel metric (EnFK) algorithm for the segmentation of sonar images which, to improve segmentation accuracy, introduces two new fuzzy terms of local spatial and statistical information. Our algorithm includes a preliminary de-noising algorithm whose output, together with the original image, feeds into the segmentation procedure to avoid becoming trapped in local minima and to improve convergence. The result is a segmentation procedure that specifically suits the intensity inhomogeneity and the complex seabed texture of sonar images. We tested our approach using simulated images, real sonar images, and sonar images that were created in two different sea experiments, using multibeam sonar and synthetic aperture sonar. The results show accurate segmentation performance that exceeds the state-of-the-art results.


I. INTRODUCTION
HIGH-RESOLUTION imagery of the seabed is mostly provided by sonar systems for the purpose of object detection. The relatively clean images produced by these technologies have increased the feasibility of automatic detection and classification (ADAC) of underwater objects [1]. ADAC is required for marine applications such as seabed archeology [2], pipeline monitoring [3] and offshore oil prospecting [4]. The process is applied onboard an autonomous underwater vehicle that independently surveys a designated area. In all of these applications, the key to successful object detection and classification is to separate the seabed background from the object's highlight and shadow regions. This separation process is referred to as image segmentation. The focus of this work is to develop a segmentation algorithm that combats the main challenge of intensity inhomogeneity, which poses difficulties in regard to sonar images. Image segmentation methods include the use of mixture [6], graph cut [7], and active contour [8]. In this work, we consider the application of sonar segmentation using the fuzzy theory. Fuzzy algorithms have been widely applied to the segmentation of optical images, and their simplicity and low complexity offer advantages to sonar segmentation onboard autonomous vehicles, where real-time analysis and low processing needs are of interest. Methods like fuzzy c-means (FCM) [9], fast generalized FCM algorithm (FGFCM) [10] and enhanced FCM (EnFCM) [11] achieve good results on natural images. However, due to the strong intensity inhomogeneity in sonar images, the results of these methods are seriously degraded [12]. Moreover, these methods are very sensitive to initialization inaccuracies and tend towards convergence to local minima.
Another challenge of current fuzzy methods is the need to set the fuzzy parameters via a trial-and-error process, e.g., according to the trade-off between the original image and the filtered one, thereby limiting the robustness of the schemes to different sea conditions.
To combat the aforementioned challenges, we propose a new sonar image segmentation algorithm to be used within the ADAC scheme. Considering the need to adapt to complex seabed structures, our solution is based on fuzzy segmentation with a non-Euclidean kernel. Our goal is to achieve stable performance in different environmental conditions and for different objects' shapes. To reduce false segmentation and to improve complexity, our method combines segmentation with a new de-noising algorithm that identifies the highlight, shadow and background pixels using split-window architecture, and employs an automatic mechanism to evaluate the de-noising performance. To better deal with intensity inhomogeneity in sonar images, our de-noising solution includes a new Bayesian-based filter. To attain accurate sonar image segmentation and fast convergence, as part of the objective function of our fuzzy optimization problem, we add two new fuzzy terms, which we refer to as the local second moment and the between-cluster. Finally, we note that both our de-noising and segmentation solutions are parameter-free, thereby improving the overall robustness of the segmentation results.
This paper focuses on the segmentation part of the automatic detection and classification (ADAC) scheme. Being implemented after the detection algorithm and before feature extraction, segmentation is a key enabling technology towards reducing false positives in the detection scheme while accurately identifying the location of the object within the sonar image. The design of segmentation algorithms has, therefore, drawn attention in the sonar processing community. However, intensity inhomogeneity and complex seabed textures in sonar imagery, such as sand ripples, cause severe degradation in the performance of existing approaches. This paper is motivated by these gaps and aims to produce a robust and efficient segmentation algorithm.
Two main contributions are identified in this work. The first is image de-noising, as a pre-processing step to image segmentation, and the second is fuzzy-based image segmentation. We use different statistical models for the shadow and background regions to better smooth the spiky noise in the image, and improve segmentation accuracy by adding two terms to the objective function that utilize spatial information to reduce the effect of inhomogeneity. These improvements lead to a segmentation scheme that is robust to intensity inhomogeneity and to different seabed structures, and that obtains the best segmentation results with fewer misclassified regions.
The key idea behind our method, referred to as the enhanced fuzzy-based with kernel metric (EnFK), is that, due to the intensity inhomogeneity, the noise in different areas within the sonar image should be treated separately. The flowchart of EnFK is illustrated in Fig. 2. We start with sonar image de-noising as a pretreatment step. Both the original image and the de-noised one serve as the input to the segmentation procedure. This structure allows for increasing the clusters' uniformity with clear boundaries between regions. To increase segmentation accuracy, the proposed de-noising scheme operates in parallel with the segmentation initialization scheme and performs its own rough clustering. This operation is controlled through a mechanism to self-evaluate the success of the de-noising scheme. Finally, to reduce the effect of noise and increase the separation of the clustering results, we feed back the segmentation results to refine both the process of image de-noising and the fuzzy formalization. The main benefit of the proposed algorithm is its improved robustness to the non-homogeneous regions in the sonar images, as well as to the different seabed structures. This robustness stems from the need to pre-set only a few system parameters. While this robustness comes at the cost of a more complex implementation structure, the complexity of EnFK is comparable with that of the benchmarks.
The main contributions of this paper are summarized as follows.
1) A new parameter-free fuzzy formalization for image segmentation. The problem is formalized in (14) and its solution is in (24).
2) A novel parameter-free de-noising approach that specifically combats intensity inhomogeneity. The algorithm's structure is illustrated in Fig. 2.
Experimental results on synthetic and real sonar images, obtained from different sonar systems such as side-scan, multi-beam and synthetic aperture sonar (SAS), show that the new algorithm is effective and efficient, as well as relatively independent of the sonar system and background type.
This paper is organized as follows. A detailed literature survey for fuzzy segmentation and image de-noising is presented in Section II. The system model and assumptions are outlined in Section III. The EnFK methods for image de-noising and segmentation are detailed in Section IV and Section V, respectively. Results and analysis of a database of simulated sonar images, real sonar images, and sonar images collected during our sea experiments are discussed in Section VI. Finally, conclusions are drawn in Section VII.

II. LITERATURE SURVEY
In this section, we survey the state-of-the-art methods for the main components of our contribution, namely, fuzzy-based image segmentation, and image de-noising.

A. Fuzzy-Based Segmentation
Fuzzy algorithms have been widely used for image segmentation. The Fuzzy C-means (FCM) algorithm [9] is a popular method due to its simplicity and fast convergence. Unlike hard-clustering methods, where each pixel is assigned a single label, the FCM allows pixels to belong to multiple labels with different membership degrees, and measures the similarities between each pixel in the image and the center of the clusters. In the geometrically-guided FCM (GG-FCM), proposed by Noordam et al. [13], geometrical information is used during the segmentation process. Szilagyi et al. [11] proposed the enhanced FCM (EnFCM), which generated a linear weighted image from the original image and the mean filter image. Then, to reduce the computational time, the segmentation is performed on the histogram instead of the pixels. Cai et al. [10] proposed the fast generalized FCM (FGFCM). This algorithm generates a nonlinear-weighted image from the original image, local spatial texture, and the gray level neighborhood. To exploit spatial information between the image's pixels, Ahmed et al. [14] proposed the FCM_S, which adds spatial information, at the cost of high complexity, thereby allowing the labeling of a pixel to be influenced by the labels of its surrounding pixels. Instead, Chen and Zhang [15] used mean and median filters in the segmentation process to incorporate spatial dependency between the pixels. The trade-off between the effectiveness of preserving an object's borders and robustness to noise is controlled by a parameter a, determined by trial-and-error.
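As a point of reference for the methods surveyed here, the following is a minimal sketch of the standard FCM alternating updates (memberships and cluster centers) applied to a 1-D array of intensities. It is not the EnFK method and contains no spatial term. The function name `fcm`, the random initialization, and the stopping rule are illustrative assumptions.

```python
import numpy as np

def fcm(y, c=3, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy c-means on 1-D intensities (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float).ravel()
    # Random initial memberships, normalized so each pixel's memberships sum to 1.
    u = rng.random((c, y.size))
    u /= u.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        um = u ** m
        v = um @ y / um.sum(axis=1)                    # cluster centers
        d2 = (y[None, :] - v[:, None]) ** 2 + 1e-12    # squared distances (guarded)
        inv = d2 ** (-1.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=0, keepdims=True)   # membership update
        converged = np.max(np.abs(u_new - u)) < tol
        u = u_new
        if converged:
            break
    return u, v
```

Each column of `u` holds the membership degrees of one pixel across the `c` clusters, which is the soft labeling that distinguishes FCM from hard clustering.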
For greater robustness, Krinidis and Chatzis [16] proposed the fuzzy local information c-means (FLICM) method, which is free of any parameter selection. In this method, a novel fuzzy factor is introduced to replace the above parameter a. This factor incorporates local spatial and local intensity information to improve robustness to noise and outliers. However, the Euclidean metric used by FLICM may not fit images with strong intensity inhomogeneity, such as sonar images. A variant of the FLICM was proposed in [17], which replaced the spatial distance with a local coefficient of variation as a local similarity measure. More recently, Shang et al. [12] proposed the clone kernel spatial FCM (CKS-FCM), where the initialization of the cluster centers is set by mimicking the biological process of the acquired immune clone. This method has been found reliable for non-convex optimization and showed robustness to noisy images [18]. However, the complexity of the CKS-FCM may be too high for the online processing of sonar images, since the initialization of the cluster centers using the immune clone process is performed in addition to the fuzzy clustering. In addition, the objective function of [12] contains free parameters that are fine-tuned in advance, thereby affecting robustness.

B. Image De-Noising
For accurate segmentation, removing noise components without distorting the object's borders is essential. Wiener filtering and wavelet transform [19] are typical approaches for image de-noising. Yet, these approaches are mostly applicable to cases of transient noises. In the NL-means filter introduced by Buades et al. [20], each pixel value is restored by the weighted average of the intensities of all pixels in the image. The weight of a pixel is determined according to the similarity between the pixel's local neighborhood and those of the other pixels in the image. A similar approach is performed in block-matching and 3-D filtering (BM3D) [21], where the non-local filtering is combined with Wiener filtering. However, our results showed that these filters are appropriate for additive white Gaussian noise, which does not match the noise embedded in sonar images well. Coupe et al. [22] proposed the NLMSF method, which can preserve the objects' borders in ultrasound images.
Our literature survey shows that image segmentation and image de-noising are well-investigated subjects. Yet, for sonar image segmentation, we identify some remaining gaps. These include the sensitivity of the existing segmentation methods to parameter selection, as well as their sensitivity to intensity inhomogeneity and different seabed textures. For image de-noising, we argue that a proper systems model, which can accurately reflect the statistical behavior of the noise in the shadow, background, and highlight regions in sonar images, is still necessary.

III. SYSTEMS MODEL AND MAIN ASSUMPTIONS
Let $Y$ be a two-dimensional sonar image with dataset $\{y_1, \ldots, y_N\} \subseteq Y$, where $y_i$ denotes the intensity of pixel $i$. Each pixel $i$ has one of three possible labels $l_i \in \{S, H, B\}$, where $S$ and $H$ are the shadow and the highlight of objects found in the image, respectively, and $B$ is the background. Our aim is to accurately identify the image's shadow and highlight regions. The noise in the sonar image is modeled as an additive component, and the statistical distribution of the noise is region-dependent. We justify this because the noise in the shadow region is the electronic noise from the receiver. Further, as performed in [23], [24], since de-noising is performed on local blocks, we follow the additive noise model for local statistics. The noisy image is

$$y_i = x_i + \xi_i, \tag{1}$$

where $\xi_i$ is a conditionally independent additive noise with a probability density function (PDF) $p_\xi(\xi)$ and $x_i$ is the true image pixel. In the EnFK method, we model $p_\xi(\xi)$ according to the pixel's label. Recall that a shadow region is created when the object is blocking the acoustic reverberation. The signal related to the shadow region consists of the electronic noise from the receiver. Thus, in this region, the noise is modeled by a zero-mean Gaussian distribution [24]. Following [25], the noise in the background and the highlight regions is modeled by the exponential distribution. Observing different sonar images, we found that this choice of statistics offers a better distinction between pixels related to the background and pixels related to the object, as opposed to, e.g., a Gaussian distribution with different parameters per class [26], or the Weibull distribution [5]. Moreover, the results also demonstrate that this distribution model is sufficiently valid to provide accurate de-noising results. We model the PDF of $\xi_i$ by

$$p_\xi(\xi) = \begin{cases} p_s(\xi), & l_i = S, \\ p_b(\xi), & l_i \in \{H, B\}, \end{cases} \tag{2}$$

where $p_s(\xi)$ is a zero-mean Gaussian PDF whose standard deviation in the shadow regions is $\sigma$, and $p_b(\xi)$ is an exponential PDF with distribution parameter $\lambda$.
Since the noise ξ is induced by the process of constructing the sonar image, we assume that λ and σ from (2) are constant per image.
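To make the region-dependent noise model concrete, the sketch below draws additive noise per pixel according to its label: zero-mean Gaussian for shadow pixels and exponential for highlight and background pixels. The function name `add_region_noise` is hypothetical, and treating $\lambda$ as a rate parameter (so that the exponential mean is $1/\lambda$) is an assumption, since the parameterization is not restated here.

```python
import numpy as np

def add_region_noise(x, labels, sigma, lam, rng=None):
    """Return y = x + noise, with label-dependent noise statistics (sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    shadow = labels == "S"
    # Shadow pixels: zero-mean Gaussian noise with standard deviation sigma.
    gauss = rng.normal(0.0, sigma, size=x.shape)
    # Highlight/background pixels: exponential noise.
    # ASSUMPTION: lam is a rate, so NumPy's scale argument is 1 / lam.
    expo = rng.exponential(1.0 / lam, size=x.shape)
    return x + np.where(shadow, gauss, expo)
```

Using one pair `(sigma, lam)` for the whole image mirrors the assumption that both parameters are constant per image.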

IV. IMAGE DE-NOISING
The image de-noising process is performed to reduce the intensity inhomogeneity of the sonar image. This objective is of importance for the task of object identification, where not only the detection of the object is of concern, but also maintaining its observed shape to ease the classification procedure.
The key idea in the proposed de-noising scheme is to use the Bayesian approach in [22] to tie different statistical models to the sonar images' different regions, namely, shadow, highlight, and background. As will be shown in the Results section, this approach leads to improved results in terms of assessment index (Q) [22], which reflects the region homogeneity level. Moreover, a novel method to self-evaluate the de-noising accuracy is introduced, thus avoiding the need to initialize the de-noising process.

A. The De-Noising Filter
Our de-noising is based on the NLMSF in [22]. For the sake of completeness, we briefly describe the main idea of the NLMSF. The NLMSF is a despeckling method that utilizes a dedicated speckle model to handle the spatial speckle patterns in the image. The blockwise Bayesian estimator $\hat{x}(B_i)$ is defined in (3) [27], where $B_i$ is a square block of size $T = (2\alpha + 1)^2$ ($\alpha \in \mathbb{N}$) centered at pixel $i$, $\Delta_i$ is a square search block centered at pixel $i$ of size $|\Delta_i| = (2M + 1)^2$ ($M \in \mathbb{N}$), $y(B_i)$ is a $T \times 1$ vector that contains all observed intensities of the pixels inside block $B_i$, and $x(B_i)$ is a $T \times 1$ vector of the unobserved (unknown true image) intensities of the pixels inside block $B_i$. By the speckle model in [22], the statistical distribution of $y_i \mid x_i$ is

$$y_i \mid x_i \sim \mathcal{N}\left(x_i,\; x_i^{2\gamma}\sigma^2\right), \tag{4}$$

where $\gamma$ is the speckle parameter, and $\sigma$ is the standard deviation of the Gaussian noise in the speckle model. Assuming independence among the pixels, the likelihood of $y(B_i) \mid y(B_k)$ can be factorized as

$$p\left(y(B_i) \mid y(B_k)\right) = \prod_{t=1}^{T} p\left(y_i^{(t)} \mid y_k^{(t)}\right), \tag{5}$$

where $y_i^{(t)}$ and $y_k^{(t)}$ are the $t$th components of $y(B_i)$ and $y(B_k)$, respectively. The restored intensity of pixel $i$ is given by the mean of all restored values in the blocks $B_i$ in which $y_i$ is included. To improve the results and speed up the algorithm, a pixel selection scheme is used [28], which is controlled by the thresholding parameter $\mu_1$.
The free parameters $\gamma$ and $\mu_1$ affect the robustness of the NLMSF to seabed intensity inhomogeneity. This is because of the need to tune these parameters for different background types. Considering this challenge, we add to the NLMSF scheme the capability to include the additional distribution types in (2) and to self-evaluate the distribution parameters. This is described in the following subsection.

B. Estimation of Distribution Parameters
We make the realistic assumption that the sonar image is intensity inhomogeneous. Under such conditions, we expect $x_i$ from (1), i.e., the noise-free image components, to be different at various locations of the image. To compensate for the location-dependent $x_i$, we divide the image into non-overlapping blocks $Y_r$ of size $\kappa_s$ and perform parameter estimation per block. Then, modeling the distribution of the noise components to be the same for the whole image, we fuse the estimates from all blocks into a single one. We note that the choice of $\kappa_s$ involves a trade-off. Small values of $\kappa_s$ may enlarge the estimation error of the parameters of the distributions, while large values of $\kappa_s$ degrade the performance of the despeckling filter in the shadow zone because large blocks contain not only shadow pixels, but also background information. We leave the choice of $\kappa_s$ to the user based on the size of the object of interest.
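The block partition described above can be sketched as follows. The helper name `split_blocks` is hypothetical, and the handling of image borders (edge blocks are simply left smaller when the image size is not a multiple of the block size) is an assumption, since the border policy is not specified here.

```python
import numpy as np

def split_blocks(img, k):
    """Split a 2-D image into non-overlapping k-by-k blocks, row by row.

    ASSUMPTION: blocks on the right and bottom borders may be smaller than k.
    """
    img = np.asarray(img)
    h, w = img.shape
    return [img[r:r + k, c:c + k]
            for r in range(0, h, k)
            for c in range(0, w, k)]
```

Every pixel belongs to exactly one block, so per-block estimates can later be fused without double counting.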
As the model in (2) reveals, the parameter estimation process must include labeling information. That is, each block must be pre-clustered into one of the possible labels $\{S, B, H\}$. Let $c_r$ be the label of the $r$th block. The label $c_r$ is determined by the majority of the pixels' labels in the $r$th block. These pixels' labels are evaluated based on the initialization algorithm from [29]. While the initialization performed well for real sonar images, because it is a model-free algorithm, it may induce some clustering errors. Still, the impact of initialization errors is low because the initialization is used only for estimating $\lambda$ and $\sigma$ as an average over all initialized windows. Thus, initialization segmentation errors in some windows would have a small effect.
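The majority rule for the block label can be sketched in a few lines. The function name `block_label` is hypothetical, and the tie-breaking behavior (the first label in sorted order wins) is an artifact of this sketch rather than something stated in the text.

```python
import numpy as np

def block_label(pixel_labels):
    """Return the most frequent label among the pixels of one block."""
    values, counts = np.unique(np.asarray(pixel_labels).ravel(),
                               return_counts=True)
    # np.argmax returns the first maximum, so ties go to the
    # label that comes first in sorted order.
    return values[np.argmax(counts)]
```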
Once the labels $c_r$ are determined, we statistically evaluate the parameters in (2). For blocks with $c_r \in \{B, H\}$, we estimate the parameter $\lambda$ in the $r$th block; similarly, for a block $r$ with $c_r = S$, we estimate $\sigma$. Then, following our assumption that the noise term parameters in (1) are constant throughout the sonar image, we follow the metric in [30], which we find to be the most suitable for dealing with outliers in sonar images, and fuse all per-block estimates into a single value as a weighted sum, where $\rho_b$ is the number of blocks labeled as highlight or background. The fusion of the parameter $\sigma$ is performed in the same fashion.
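The per-block estimation and fusion can be illustrated with the following sketch. It is not the estimator of this paper: it assumes that the noise-free intensity is roughly constant inside a block, so that the block's standard deviation equals that of the noise; it assumes $\lambda$ is a rate (exponential standard deviation $1/\lambda$); and it replaces the weighted fusion of [30] with a plain average. The function name `estimate_noise_params` is hypothetical.

```python
import numpy as np

def estimate_noise_params(blocks, block_labels, eps=1e-12):
    """Return (lam, sigma) fused over blocks; None where no block qualifies."""
    lam_r, sigma_r = [], []
    for blk, c_r in zip(blocks, block_labels):
        s = float(np.std(blk))
        if c_r in ("B", "H"):
            # ASSUMPTION: exponential noise in rate form has std = 1 / lam.
            lam_r.append(1.0 / max(s, eps))
        elif c_r == "S":
            # Gaussian noise: the block std estimates sigma directly.
            sigma_r.append(s)
    # ASSUMPTION: a plain mean stands in for the weighted sum based on [30].
    lam = float(np.mean(lam_r)) if lam_r else None
    sigma = float(np.mean(sigma_r)) if sigma_r else None
    return lam, sigma
```

Because the fused values average over many blocks, a few mislabeled blocks shift the result only slightly, which is the tolerance to initialization errors argued above.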

C. Setting the De-Noising Filter
1) Formalization: The flowchart of our de-noising scheme is shown in Fig. 1. The de-noising is carried out separately for each block $Y_r$. Since the NLMSF block size $T$ and the search-block size are much smaller than the original image size, for block de-noising, we assume that the labels of the pixels in $Y_r$ are identical. Based on the model in (1), the numerator of (3) can be rewritten in terms of the noise PDF $p_\xi(\xi)$, as given in (12). Based on the distribution model in (2), $p_\xi(\xi)$ equals $p_s$ or $p_b$ according to the label of the block $Y_r$. That is, to calculate (12), we require information about the label of each block. Unlike the parameter estimation process, which tolerates errors in the blocks' initial clustering, de-noising a block whose label is erroneously identified, i.e., with the wrong distribution model in (2), will likely lead to image distortion. Thus, for block de-noising, we avoid using the segmentation initialization. Instead, for each block, we find the distribution that leads to the best de-noising result.
2) Self-Evaluation of De-Noising Performance: We measure successful de-noising using the concept of minimum entropy. This is because the block's entropy characterizes the distribution of the restored intensities. In particular, intensities concentrated around a few values lead to small entropy, while uniformly spread intensities lead to high entropy. Thus, setting the minimum entropy as a quality measure will lead to the choice of the best-localized values, which is the choice with the most homogeneous intensity of the de-noised block. The entropy is calculated by

$$H_r = -\sum_{i} p(i) \log p(i), \tag{13}$$

where $p(i)$ is the number of pixels (after normalization) in the $r$th de-noised resulting block at the $i$th intensity bin.
Given the uncertainty of the block label, each block is de-noised using (12) with both $p_b(\xi)$ and $p_s(\xi)$, to create the de-noised images $x_b$ and $x_s$, respectively. Then, the entropy is calculated for each of the resulting images, and the chosen de-noising result is the one of minimum entropy.
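The minimum-entropy selection between the two candidate results can be sketched as below. The histogram settings (256 bins over the range 0 to 255) and the base-2 logarithm are assumptions, as are the function names `block_entropy` and `select_denoised`; the two candidates are assumed to have been produced beforehand by the filter under each noise model.

```python
import numpy as np

def block_entropy(block, bins=256, value_range=(0.0, 255.0)):
    """Shannon entropy (in bits) of the intensity histogram of one block."""
    hist, _ = np.histogram(block, bins=bins, range=value_range)
    p = hist / hist.sum()      # normalized pixel counts per intensity bin
    p = p[p > 0]               # empty bins contribute nothing
    return float(-(p * np.log2(p)).sum())

def select_denoised(x_b, x_s):
    """Return (block, tag) for the candidate with the smaller entropy."""
    if block_entropy(x_b) <= block_entropy(x_s):
        return x_b, "B/H"      # exponential model chosen
    return x_s, "S"            # Gaussian (shadow) model chosen
```

A constant block has zero entropy under this measure, so the more homogeneous candidate is always preferred.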

V. ENHANCED FUZZY-BASED IMAGE SEGMENTATION
The uniqueness of the proposed sonar image segmentation method lies in the introduction of two new terms: the local second moment term and the between-cluster term. As shown in the Results section, using these two terms leads to better segmentation accuracy and faster convergence. Inspired by the echo detection process in [24], we use the second moment of the image data in the kernel space to better separate the object's highlight from its background. The between-cluster term represents the error between the prototypes and the empirical cluster centers and is incorporated, in the kernel space, into the objective function to reduce false segmentation. Furthermore, as the distance metric, we use the kernel distance [31] rather than the commonly used Euclidean distance between pixels. This is because, in the kernel space, the formed clusters are more spherical and can, therefore, be more easily clustered [32].
The kernel maps a data set $U$ into a higher dimensional space $W$ (kernel space) via the transform function $\Phi: U \rightarrow W$. The fuzzy algorithm generates the degrees of membership $u_{ki}$, i.e., the probability that the label of pixel $i$ belongs to the $k$th cluster, by minimizing the objective function in (14), where $c$ is the number of clusters. For the three possible classes $\{B, H, S\}$, our objective function is defined in (15). The term $G_{ki}$ is referred to as the fuzzy factor [16] and is defined in (16), where $d_{ij}$ is the Euclidean distance between pixel $i$ and pixel $j$, $m$ is the weighting exponent on each fuzzy membership, which determines the fuzziness of the results, $\hat{x}_i$ is the de-noised image, $\|\Phi(\bar{x}_i) - \Phi(V_k)\|^2$ is the local second moment term in the kernel space, and $\bar{x}_i$ is given by (17), where $N_i$ is the local window of size $(2\beta + 1)^2$ centered at pixel $i$, $\{V_k\}_{k=1}^{c}$ are the centers of the clusters, $\|\Phi(\tilde{x}_k) - \Phi(V_k)\|^2$ is the between-cluster term in the kernel space, and $\tilde{x}_k$ is the average of all pixels assigned to the $k$th cluster, given by (18), where $n_k$ is the number of pixels whose assigned label is the $k$th cluster. A kernel in the data space can be represented as

$$K(x, y) = \Phi(x) \cdot \Phi(y), \tag{19}$$

where $\cdot$ is the inner product operation. Because $\Phi$ in (15) is unknown, the segmentation problem should be solved using only the kernel function. Since $K(x, x) = 1$, the squared distance in the kernel space can be written in terms of the kernel as

$$\|\Phi(x) - \Phi(y)\|^2 = 2\left(1 - K(x, y)\right). \tag{20}$$

Using (20), (15) is rewritten as (21). The solution to (14) is obtained by choosing the values $u_{ki}$ that minimize (21). Using a Lagrange multiplier, the solution comes readily as (22) and (23).
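The step from the kernel definition in (19) to the kernel-only distance in (20) can be spelled out. The short derivation below is a sketch that assumes only the inner-product form of the kernel and the normalization $K(x, x) = 1$, which holds for the Gaussian radial basis kernel:

```latex
\begin{align*}
\|\Phi(x) - \Phi(y)\|^2
  &= \Phi(x)\cdot\Phi(x) - 2\,\Phi(x)\cdot\Phi(y) + \Phi(y)\cdot\Phi(y) \\
  &= K(x, x) - 2K(x, y) + K(y, y) \\
  &= 2\bigl(1 - K(x, y)\bigr).
\end{align*}
```

The mapping $\Phi$ never has to be evaluated explicitly, which is why the objective can be minimized using the kernel function alone.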
Finally, pixel $i$ is assigned to the cluster with the highest membership:

$$l_i = \arg\max_{k} u_{ki}. \tag{24}$$

We adopt the Gaussian radial basis kernel function (GRBF) [33], $K(x, y) = \exp(-\|x - y\|^2 / r^2)$, where $r$ is the kernel's bandwidth, which is set on the basis of the distance variance of all pixels [30]. We choose the Gaussian radial basis kernel because of its robust estimation, controlled by the single bandwidth parameter $r$, and since it is able to capture non-linear connections between the observed data and the class labels. The bandwidth $r$ is given by (25). The data center $u$ is given by

$$u = \frac{1}{N} \sum_{i=1}^{N} y_i. \tag{26}$$

The distance from pixel $i$ to the data center is $d_i = \|y_i - u\|$. The mean distance $\bar{d}$ is calculated by

$$\bar{d} = \frac{1}{N} \sum_{i=1}^{N} d_i. \tag{27}$$

We solve (14) iteratively. In each iteration, $\tilde{x}_k$ is set by (18) according to the segmentation solution from the previous iteration. Then, the prototype $V_k$ is calculated by (23) and plugged into (22) to yield a new estimation (24). The process stops upon convergence according to the criterion in (28), where $u_{ki}^{(q)}$ is the membership at the $q$th iteration, $\epsilon$ is a convergence parameter, and $N_q$ is the maximum number of allowed iterations. A numerical proof of the convergence of the above procedure appears below. The labels of the pixels in $Y$ are initialized using our LSM algorithm [29]. The initial prototype $V_1$ of the shadow cluster is set by (29), with $\rho_s$ equal to the number of pixels whose initialized label is $S$, and analogously for $V_2$ and $V_3$, which are related to the highlight and the background, respectively.
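The kernel-related quantities above can be sketched for scalar intensities as follows. The bandwidth is computed here as the sample standard deviation of the distances $d_i$ with an $N - 1$ normalization, which is one common reading of "distance variance" and is an assumption, not necessarily the exact form of (25). All function names are hypothetical.

```python
import numpy as np

def grbf_bandwidth(y):
    """Bandwidth r from the spread of the pixel-to-center distances.

    ASSUMPTION: r is the sample standard deviation (N - 1) of d_i.
    """
    y = np.asarray(y, dtype=float).ravel()
    u = y.mean()              # data center
    d = np.abs(y - u)         # distance of each pixel to the data center
    return float(np.sqrt(np.sum((d - d.mean()) ** 2) / (y.size - 1)))

def kernel_dist2(x, v, r):
    """Squared kernel-space distance 2 * (1 - K(x, v)) for the GRBF kernel."""
    k = np.exp(-((np.asarray(x, dtype=float) - v) ** 2) / r ** 2)
    return 2.0 * (1.0 - k)

def hard_labels(u):
    """Assign each pixel (column of u) to its highest-membership cluster."""
    return np.argmax(u, axis=0)
```

The kernel distance is bounded between 0 and 2, so a single outlier pixel cannot dominate the objective the way it can with a Euclidean distance.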
The proposed segmentation process is illustrated in Fig. 2. The EnFK method comprises six main steps: 1) Clusters initialization using the robust LSM initialization method; 2) Image de-noising using the improved NLMSF-based algorithm; 3) Calculating the local second moment $\{\bar{x}_i\}_{i=1}^{N}$; 4) Updating the membership matrix; 5) Updating the cluster prototypes; and 6) Stopping criteria.

VI. EXPERIMENTAL RESULTS AND COMPARISON
In this section, the performance of the proposed image de-noising algorithm is presented, as well as the overall performance of the segmentation process. For de-noising, we use $\kappa_s = 255$, and compare the results of our algorithm with those of the robust NLMSF filter in [22] with $\gamma = 0.5$, $\alpha = 1$, and $M = 3$, which we found to be the most suitable for sonar image de-noising. In this work, we focus on fuzzy sonar image segmentation. Therefore, the proposed method is compared with the state-of-the-art in fuzzy segmentation. In particular, we choose as benchmarks the methods FCM_S1 [15], FCM_S2 [15], KFCM_S1 [34], KFCM_S2 [34], FLICM [16], and fast and robust fuzzy C-Means (FRFCM) [35]. For completeness, we also add to the benchmarks a non-fuzzy segmentation method based on the Dempster-Shafer evidence theory (EDSM) [36]. We choose these methods as benchmarks since they are key methods in image segmentation and are heavily cited, and because of their suitability for sonar image processing. Further, these methods represent different approaches in fuzzy image segmentation. Specifically,
• FCM_S1: Utilizes the mean filter in the objective function to increase robustness to noise.
• FCM_S2: Utilizes the median filter in the objective function to compensate for the intensity inhomogeneity.
• KFCM_S1: Extension of FCM_S1 that maps the original input into the kernel space to increase separability of the data.
• KFCM_S2: Extension of FCM_S2 to the kernel space.
• FLICM: Controls the influence of the local neighborhood pixels on the labeling without any parameter selection.
• FRFCM: Incorporates the local spatial information into the fuzzy objective function by morphological reconstruction.
For all fuzzy-based segmentation methods, the fuzziness index is set to $m = 2$, and $\beta = 2$. The maximum number of iterations $N_q$ is set to 250 and the threshold to $\epsilon = 0.01$. According to [34], the parameter a is set to 5 for FCM_S1, FCM_S2, KFCM_S1 and KFCM_S2. According to [35], the morphological reconstruction parameter is set to 3, and the size of the filtering window is set to 7. For EDSM, we set $\gamma_1 = 0.1$ and $\gamma_2 = 1.4$ according to [36].

A. Sonar Data
Our sonar data comprises two data sets. The first image set consists of three synthetic sonar images, presented in Fig. 4(a)-(c) (top images). The images contain a cylindrical object with different backgrounds: sand, sea-grass, and sand ripples. The size of the synthetic images is 120×120. Sand and sea-grass seabed textures are generated according to the models in [25], while the sand ripples texture is generated according to [37]. The object's region and background are synthesized separately. The object's intensity level is modeled by a gamma distribution with mean values and standard deviations of 120, 10 for the shadow region, and 1, 0.1 for the highlight region, respectively. Similar to the model in [38], the object and background are superimposed region by region as in (30), where $\chi_{i,h}$, $\chi_{i,b}$ and $\chi_{i,s}$ stand for the intensity level of the $i$th pixel in the highlight, background and shadow regions, respectively.
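Since the gamma-distributed intensities above are specified by a mean and a standard deviation, while most samplers expect a shape and a scale, the conversion is sketched below. The function name `gamma_from_mean_std` and the sample size are illustrative assumptions; the sketch covers only this parameter conversion and not the texture models of [25] and [37].

```python
import numpy as np

def gamma_from_mean_std(mean, std):
    """Convert (mean, std) to gamma (shape, scale).

    For a gamma distribution, mean = shape * scale and
    variance = shape * scale**2.
    """
    shape = (mean / std) ** 2
    scale = std ** 2 / mean
    return shape, scale

# Example: intensities with mean 120 and standard deviation 10.
shape, scale = gamma_from_mean_std(120.0, 10.0)
rng = np.random.default_rng(0)
samples = rng.gamma(shape, scale, size=20000)
```

For a mean of 120 and a standard deviation of 10, this gives a shape of 144 and a scale of 5/6.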
The second set consists of five real sonar images. The first image, Crab Traps, was acquired with a CM2 towfish sonar [39]; the second, Airplane, with the Sea Scan 600 sonar [40]; and the third, Drowning Victim, with the EdgeTech 4125 sonar [41]. The sizes of these three images are 151×301, 182×232, and 182×232, respectively, and the images are given in Fig. 7(a)-(c) (top images). The last two images are sonar images we sampled during our sea experiment. These images include two steel targets (Fig. 11(b)) and a submerged gas well (Fig. 11(e)), and are of size 200×120 and 120×300 pixels, respectively.

B. Evaluation Indexes
To calculate the quantitative assessment of the de-noising and segmentation results, ground-truth maps are usually needed. These maps are generated by manual segmentation [4] based on the original sonar images. We use the variation of information VI [42], the partition coefficient $v_{pc}$ and the partition entropy $v_{pe}$ [43], and the MCR [44] to evaluate the segmentation results, and the despeckling assessment index $Q$ [22] to evaluate the de-noising results. The VI measures the dissimilarity between two maps in terms of information entropy. In our case, the first map is the segmentation results map $S_r$ and the second is the ground-truth map $G_t$:

$$VI(S_r, G_t) = H(S_r) + H(G_t) - 2 I(S_r, G_t),$$

where

$$H(S_r) = -\sum_{k} \frac{N_k}{N} \log \frac{N_k}{N}, \qquad I(S_r, G_t) = \sum_{k} \sum_{k'} \frac{N_{k,k'}}{N} \log \frac{N \, N_{k,k'}}{N_k N_{k'}},$$

with $N_k$ the number of pixels assigned the label $k$ and $N_{k,k'}$ the number of points in the intersection of the maps $S_r$ and $G_t$ considering the labels $k$ and $k'$, respectively. If both maps are identical, the entropies $H(S_r)$ and $H(G_t)$ are equal and $VI(S_r, G_t) = 0$. The partition coefficient $v_{pc}$ and the partition entropy $v_{pe}$ are defined as [43]

$$v_{pc} = \frac{1}{N} \sum_{i=1}^{N} \sum_{k=1}^{c} u_{ki}^2, \qquad v_{pe} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{k=1}^{c} u_{ki} \log u_{ki}.$$

Best clustering is achieved when $v_{pc}$ is close to one and $v_{pe}$ approaches zero. The despeckling assessment index $Q$ is defined in [22] in terms of $\mu_j$ and $\sigma_j$, the mean and the variance of the intensities of the pixels assigned the $j$th label after de-noising.
To calculate the despeckling assessment index, we use the ground-truth map for the pixels' label information. The higher the value of $Q$, the better the de-noising result. The MCR is defined as the number of misclassified pixels normalized by the total number of pixels and is given by [44]

$$MCR = \frac{1}{N} \sum_{i=1}^{N} I\left(S_r(i) \neq G_t(i)\right),$$

where $I$ is an indicator function that equals one if its argument is true, and zero otherwise. The lower the MCR, the better the segmentation.
[Figure caption, partially recovered: ... yellow: exponential (for highlight/background regions). We observe that our approach yields a much better de-noising than the benchmark, with an accurate identification of the blocks containing the object's shadow.]
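The segmentation indexes described in this subsection can be computed along the following lines. This is a sketch under stated assumptions: label maps are integer or string arrays of equal shape, the membership matrix `u` has one row per cluster and one column per pixel, natural logarithms are used, and the function names are hypothetical. The index $Q$ is omitted because its formula is taken from [22] and not restated here.

```python
import numpy as np

def mcr(seg, gt):
    """Fraction of pixels whose label differs from the ground truth."""
    return float(np.mean(np.asarray(seg).ravel() != np.asarray(gt).ravel()))

def partition_coefficient(u):
    """v_pc: mean over pixels of the sum of squared memberships."""
    return float(np.sum(u ** 2) / u.shape[1])

def partition_entropy(u, eps=1e-12):
    """v_pe: mean over pixels of the membership entropy (eps guards log 0)."""
    return float(-np.sum(u * np.log(u + eps)) / u.shape[1])

def variation_of_information(seg, gt):
    """VI = H(seg) + H(gt) - 2 I(seg, gt), from the joint label histogram."""
    seg, gt = np.asarray(seg).ravel(), np.asarray(gt).ravel()
    _, s_idx = np.unique(seg, return_inverse=True)
    _, g_idx = np.unique(gt, return_inverse=True)
    joint = np.zeros((s_idx.max() + 1, g_idx.max() + 1))
    np.add.at(joint, (s_idx, g_idx), 1.0)   # counts N_{k,k'}
    joint /= seg.size
    ps, pg = joint.sum(axis=1), joint.sum(axis=0)

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    nz = joint > 0
    mi = np.sum(joint[nz] * np.log(joint[nz] / np.outer(ps, pg)[nz]))
    return float(entropy(ps) + entropy(pg) - 2.0 * mi)
```

As a sanity check, identical maps give a VI of zero and an MCR of zero, and a crisp membership matrix gives $v_{pc} = 1$ and $v_{pe} \approx 0$.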

C. Results on Synthetic Images
1) De-Noising Results: In Fig. 3, the de-noising results obtained for a synthetic image of a cylindrical object on a sea-grass background are presented. It can be seen that the proposed de-noising method yields more homogeneous regions than the NLMSF. Clear identification of the shadow and highlight/background regions can be seen in Fig. 3(d). Table I shows the values of Q for the synthetic sonar images. The experimental results reveal that both methods preserve the object's boundaries. The proposed method shows a relatively small advantage in terms of the despeckling index Q. However, as shown later on, the performance gap increases significantly for real sonar images.

2) Segmentation Results:
To statistically test the segmentation performance, we performed 1,000 Monte-Carlo simulation runs. In each run, the image was corrupted by speckle noise with a variance of 0.09. Fig. 4 shows the segmentation results for the benchmark schemes compared with the EnFK algorithm. The segmentation results are illustrated by three colors: white for highlight, black for shadow, and gray for the background. For a sandy background (Fig. 4(a)), FLICM, FCM_S1, EDSM, and the EnFK obtain accurate results. For a sea-grass background (Fig. 4(b)), poor segmentation performance is observed for most benchmark schemes, whereas the performance of EDSM and EnFK is almost unaffected. Still, EnFK has better region uniformity and fewer misclassified regions than EDSM. The same result is observed in Fig. 4(c), where we show segmentation results for a sand-ripples background. Here, most benchmarks completely fail to segment the object, whereas the FLICM fails to identify the object's shadow region and wrongly assigns the shadow label to the background region.
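The Monte-Carlo procedure can be sketched as follows. This is a minimal illustration under stated assumptions rather than the simulation code used here: the noise model is taken to be multiplicative, y = x + x·n, with n drawn from a zero-mean Gaussian distribution of variance 0.09 (the distribution of n is our assumption), and `segment` and `score` are hypothetical callables standing in for a segmentation algorithm and a quality metric.

```python
import random


def add_speckle(image, variance=0.09, rng=random):
    """Corrupt a 2-D list image with multiplicative noise: y = x + x * n.

    Assumption: n is zero-mean Gaussian with the given variance.
    """
    sigma = variance ** 0.5
    return [[x + x * rng.gauss(0.0, sigma) for x in row] for row in image]


def monte_carlo(image, segment, score, runs=1000, variance=0.09, seed=0):
    """Average a quality score over independently corrupted copies of image.

    segment: hypothetical callable mapping a noisy image to a label map.
    score:   hypothetical callable mapping a label map to a number.
    """
    rng = random.Random(seed)  # seeded for reproducible runs
    total = 0.0
    for _ in range(runs):
        noisy = add_speckle(image, variance, rng)
        total += score(segment(noisy))
    return total / runs
```

Seeding the generator makes the runs reproducible, so that different segmentation schemes can be compared on the same noise realizations.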
The statistical evaluation of the four quality metrics over the Monte-Carlo simulations is presented in Fig. 5. The EnFK attains its lowest VI for the sand background, which is considered the easiest of the three background types to segment. For all background types, the EnFK performs best, obtaining the lowest VI. From the results in Fig. 5(b) and Fig. 5(c), we observe that, for the EnFK, the value of v_pc is closer to 1 than that of the benchmarks, and that v_pe is closer to 0, which implies that the EnFK achieves better segmentation accuracy. For a sand background, EDSM achieves the best MCR results. However, for more complex backgrounds such as sea-grass and sand-ripples, the EnFK obtains more accurate segmentation results.
In addition, the MCR results in Fig. 5(d) confirm that our method is more accurate than the benchmarks, with an MCR of less than 3% for all images.

D. Results on Real Sonar Images
1) De-Noising Results:
The de-noised version of the Crab Traps image is shown in Fig. 6. Strong intensity inhomogeneity can be clearly seen in the original sonar image in Fig. 6(a). Accurate identification of the shadow and highlight/background regions can be seen in Fig. 6(d). To allow for a quantitative comparison between the de-noising methods, Table II shows the despeckling assessment index Q for Fig. 6. Our method produces higher values of the assessment index than the NLMSF, which reflects the homogeneity of the image after de-noising.
2) Segmentation Results: Fig. 7(a) shows the segmentation results for the Airplane image. Obvious false segmentation exists in the results produced with FCM_S1, FCM_S2, KFCM_S1, and KFCM_S2. When using these methods, the homogeneity of the background is corrupted by false segmentation. In Fig. 7(b), we observe that both the FLICM and EnFK correctly identify the object's shadow as well as its highlight. However, the benchmarks FCM_S2, FCM_S1, KFCM_S2, and KFCM_S1 contain many cases of false segmentation in the image's background. Moreover, the EnFK has the best segmentation results, with better region uniformity and fewer misclassified regions. EDSM and FRFCM fail to identify the highlights of all objects. Similar results are obtained for the Drowning Victim image in Fig. 7(c). The segmentation produced by the FLICM and FRFCM algorithms includes many cases of false segmentation, and the smoothness of the object's boundaries is corrupted. The boundaries in the segmentation results obtained by EDSM and EnFK are well defined. However, EDSM captures a few misclassified regions. In comparison, the EnFK produces accurate segmentation with better region uniformity.
A quantitative comparison is presented in Fig. 8. For the variation of information VI (Fig. 8(a)), the results show that the EnFK has the lowest VI value among the compared algorithms. These results indicate that our method generates more uniform segmented regions and more accurate cluster boundaries than the compared algorithms. Figs. 8(b) and 8(c) show the partition coefficient v_pc and the partition entropy v_pe from (33). We observe that the partition coefficient of our method is much closer to one than that of the other algorithms. Similarly, the partition entropy of our method is significantly closer to zero than that of the benchmark algorithms. Both results indicate that the EnFK generates more separable clusters than the benchmark methods.
3) Effect of the Between-Cluster and Local Second Moment Terms: These terms are used to improve the convergence rate of our algorithm, as well as its segmentation performance. Fig. 9(a) shows the number of iterations until convergence for two versions of the EnFK: one with the two new terms and one without them. We observe that the use of these terms dramatically reduces the number of iterations for all tested images. Moreover, Fig. 9(b) shows that the VI values obtained with these terms are significantly better than those obtained without them.

4) Effect of De-Noising:
The effect of sonar image de-noising on the final segmentation results is analyzed in Table III. We measure the efficiency of the de-noising process in terms of the MCR from (35). We test three versions of our fuzzy segmentation algorithm. In version A, the input for the first term of (15), x_i, is replaced with y_i (no de-noising); in version B, the de-noised data, x_i, is produced by the NLMSF; in version C, it is produced by the de-noising step of the EnFK. The large deviation of the MCR among the sonar images is mainly due to the different levels of intensity inhomogeneity in the images. While the NLMSF de-noising has a minimal effect, we observe that the EnFK achieves the lowest MCR values. This means that false segmentation, caused by the intensity inhomogeneity in the background, is effectively reduced by our method.

5) Effect of De-Noising Block Size:
To compensate for the location dependency in the sonar image, we divide the image into non-overlapping blocks of size κ_s in the de-noising process. The effect of κ_s on the final segmentation results and on the de-noising performance, in terms of MCR and Q, respectively, is analyzed in Fig. 10. To that end, we use the Crab Traps image from Fig. 6(a). The tradeoff in the choice of κ_s is observed in Fig. 10(a). While small values of κ_s may enlarge the estimation error of the parameters of the distribution, large values of κ_s degrade the performance of the despeckling filter because large blocks contain pixels from different regions. For de-noising, we observe a deviation of 30% in the despeckling assessment index Q. For the overall segmentation results, we notice a deviation in the MCR of about 1% for different values of κ_s.
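The block partition can be illustrated with a short sketch. It assumes square blocks of κ_s × κ_s pixels and truncated blocks at the image borders; both choices, as well as the function names, are assumptions made for this example. The per-block mean is used only as a stand-in for the distribution parameters estimated in each block.

```python
def split_into_blocks(image, block_size):
    """Yield (row, col, block) for non-overlapping block_size x block_size tiles.

    Blocks at the right and bottom borders are truncated when the image size
    is not a multiple of block_size (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    for r in range(0, rows, block_size):
        for c in range(0, cols, block_size):
            yield r, c, [row[c:c + block_size] for row in image[r:r + block_size]]


def block_means(image, block_size):
    """Return {(row, col): mean intensity} for every block of the partition."""
    means = {}
    for r, c, block in split_into_blocks(image, block_size):
        pixels = [p for row in block for p in row]
        means[(r, c)] = sum(pixels) / len(pixels)
    return means
```

The tradeoff described above is visible in this structure: a small `block_size` leaves few pixels per block for estimating statistics, whereas a large one makes each block more likely to mix pixels from different regions.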

E. Sea Trial Results
To further validate the performance of the EnFK, we now introduce results from two sea experiments that we performed with our own sonar system. In contrast to the previously explored sonar images, this data allows us to compare performance under real sea conditions.
[Figure caption fragment: ... (35). The EnFK exceeds the compared algorithms by 48%, 17%, 76%, and 82% on average over all images for the VI, v_pc, v_pe, and MCR measures, respectively.]

1) Experiment Description:
In the first experiment, a 400 kHz multi-beam sonar, EM 2040, mounted on the RV Bat-Galim, was used to scan two truncated-cone targets, shown in Fig. 11(a). This experiment was performed roughly 2 miles west of Haifa Port in Israel. The targets were placed on a rocky seafloor at a depth of 25 m. The second experiment was conducted about 10 km west of northern Israel at a water depth of 1,000 m. In this experiment, we deployed our 5.5 m A18 AUV by Eca Robotics Inc. [45] and used the vehicle's Kraken-made two-sided synthetic aperture sonar (SAS) to scan the seabed for objects of opportunity. A SAS image of a gas well that was found is presented in Fig. 11(e).
2) Experiment Results: Fig. 11 shows the de-noising results for the NLMSF and EnFK for the two sonar images. Both methods preserve the edges of the targets. However, the EnFK better smooths the background and, as observed from the resulting despeckling index Q, the final de-noised image is more homogeneous.
In Fig. 12, the segmentation results are presented for the multi-beam image. This image contains one cone-shaped target. We observe that FCM_S1 (Fig. 12(b)), FCM_S2 (Fig. 12(c)), KFCM_S1 (Fig. 12(d)), and KFCM_S2 (Fig. 12(e)) yield poor performance in terms of region uniformity, and that the FLICM (Fig. 12(f)) fails to identify the target's shadow regions. In FRFCM and EnFK, the region uniformity is good, and the boundaries between regions are clear. However, only the EnFK (Fig. 12(g)) manages to separate the target's shadow from its background entirely. For the SAS image, Fig. 13 shows that the shadow region is successfully segmented by all methods. However, the considerable intensity inhomogeneity in this image leads to many false segmentation regions for all methods. Still, the EnFK produces, by far, the lowest false segmentation rate.
The simulation and experimental results reveal that the EnFK dramatically reduces the misclassified regions in sonar images and can generate clean and accurate clusters of shadow and highlight. This result directly affects the performance of object classification, which is the next step in the ADAC detection chain.

VII. CONCLUSION
In this paper, we proposed a fuzzy-based image segmentation method for sonar imagery, referred to as EnFK. To improve background homogeneity, the EnFK includes a novel image de-noising step which, together with the original image, feeds into the segmentation process. To reduce false segmentation, we introduced two novel terms as part of the fuzzy objective function. Simulation results show that the use of these two terms dramatically reduces the number of iterations until convergence and decreases the false segmentation rate. Experiments over a broad set of Monte-Carlo simulated sonar images, three cited real sonar images, and two self-measured sonar images from a multi-beam sonar and a SAS show that the EnFK produces high segmentation accuracy. The experimental results also show that the highlight region of objects in multi-beam sonar images is barely detected in the segmented image. Future work will deal with this problem to enable more accurate segmentation.