Performance Enhanced Ultrasound Probe Tracking With a Hemispherical Marker Rigid Body

Among tracking techniques applied in the 3-D freehand ultrasound (US), the camera-based tracking method is relatively mature and reliable. However, constrained by manufactured marker rigid bodies, the US probe is usually limited to operate within a narrow rotational range before occlusion issues affect accurate and robust tracking performance. Thus, this study proposed a hemispherical marker rigid body to hold passive noncoplanar markers so that the markers could be identified by the camera, mitigating self-occlusion. The enlarged rotational range provides greater freedom for sonographers while performing examinations. The single-axis rotational and translational tracking performances of the system, equipped with the newly designed marker rigid body, were investigated and evaluated. Tracking with the designed marker rigid body achieved high tracking accuracy with 0.57° for the single-axis rotation and 0.01 mm for the single-axis translation for sensor distance between 1.5 and 2 m. In addition to maintaining high accuracy, the system also possessed an enhanced ability to capture over 99.76% of the motion data in the experiments. The results demonstrated that with the designed marker rigid body, the missing data were remarkably reduced from over 15% to less than 0.5%, which enables interpolation in the data postprocessing. An imaging test was further conducted, and the volume reconstruction of a four-month fetal phantom was demonstrated using the motion data obtained from the tracking system.

imaging [2]. It produces real-time images of structures within the human body or movement of internal organs by transmitting and receiving ultrasonic waves from ultrasonic transducers. The 2-D images are then reconstructed by a computer based on the echo signals. This imaging process is nonionizing, which is preferred when radiation sensitivity is a concern, such as for obstetrics [3]. Apart from being a noninvasive modality, US imaging is also portable, easily accessible, and cost-effective. However, barriers exist in the widespread clinical applications of US imaging since the image quality is highly dependent on sonographers' skills, knowledge, and experience [2]. Furthermore, due to the limited field of view, multiple frames are often required to cover the whole scanning region, which makes it more challenging for sonographers to imagine the 3-D view of an anatomical structure from a series of 2-D images [4].

B. 3-D US Imaging Techniques
The 3-D freehand US imaging has been proposed to overcome the above limitations [5]. Current 3-D US imaging methods can be categorized into three different types: mechanical scanning [6]-[8], 2-D array scanning [9]-[11], and freehand scanning [12]-[16]. In mechanical scanning, the US probe is attached to a six-degree-of-freedom robotic arm for scanning. The position and orientation of the probe can be accurately calculated based on the geometry of the robotic arm and the encoder information. However, the robot-assisted scanning system is usually too bulky and heavy to use in clinical applications [17]. In 2-D array scanning, the acoustic beams are steered electronically in both the elevational and azimuthal directions to obtain a 3-D view of the anatomy. It is the fastest way to view real-time 3-D images. However, 2-D array transducers are difficult to fabricate and thus usually come at a high cost [18]. The 3-D freehand US scanning is a promising imaging modality, which can be realized with or without a tracking sensor. Sensorless freehand tracking estimates the position and orientation from the US images themselves. It does not require a complicated dependence on the scanning protocol. However, speckle decorrelation typically requires fully developed speckles [19] and only partially captures the underlying complexity of US image formation [15]. It is especially challenging to accurately estimate the probe motion, particularly out-of-plane motion, as different image contents can be caused by either the probe motion or different tissues. Also, the decorrelation rate depends not only on the transducers but also on the medium [15], [20]. Gee et al. [12] further extended the speckle decorrelation algorithms by adapting the decorrelation curves to account for apparent coherent scattering. Another challenge is how to eliminate the error accumulation between adjacent frames.
A sensor-based 3-D freehand US utilizes one of the following sensors or a combination of two or more of them: inertial sensors, acoustic sensors, electromagnetic sensors, and optical sensors. The inertial measurement unit (IMU) estimates the position and orientation by integrating the acceleration and the angular velocities, respectively [21]. However, the positional tracking suffers from drift [22]. A recent study by Prevost et al. [15] considered IMU sensors to be adequate for rotational tracking but insufficient for positional tracking due to their low signal-to-noise ratio and the required double integration. They instead achieved the 3-D reconstruction of freehand US sweeps using a convolutional neural network with orientation data from an IMU sensor. Acoustic sensors solve the target positioning using the sound speed, receiver positions, and measured time-of-flight or time-difference-of-flight [23]. Thus, they can be applied in dark environments such as in a US exam. However, to maintain a high signal-to-noise ratio, the tracking volume is limited. Also, there should not be any obstacles between the transmitters and the receivers [24]. Among all existing tracking techniques, optical tracking and electromagnetic tracking are more popular in indoor environments, especially in clinical applications. In an obstetrical exam, large metal objects are usually unavoidable. Therefore, Lang et al. [13] tried to fuse the speckle decorrelation method with the electromagnetic sensor to correct the metallic distortions of the electromagnetic sensor. Compared with electromagnetic tracking, optical tracking is preferred in some cases. Optical tracking uses a camera system to determine the real-time position and orientation of an object either by tracking a set of markers or from RGB (red, green, and blue) images. The former uses a fixed camera system to track a set of active or passive infrared (IR) markers attached to the object.
The latter mimics human eyes by extracting depth information from the RGB images [25]. Therefore, it requires good illumination. In fetal US, the lighting is usually controlled by a combination lighting system (dim/bright). In most cases, the room is dark so that the sonographer can view the images clearly on the device's display. In situations like this, a camera system may struggle to capture useful RGB images to calculate the depth information. Therefore, optical tracking with IR markers is more suitable. Active markers are battery-powered and thus are usually heavier than passive markers [26]. To ensure continuous tracking during the operation, passive IR markers were adopted in this research.

C. Objective
Although optical tracking is accurate, the tracking performance can be limited by occlusion, which is discussed in Section II. The objective of this study is to improve the tracking performance of the camera-based tracking by replacing the 2-D marker rigid body with a newly designed hemispherical marker rigid body. With most motion data captured, the occlusion issues can be mitigated. Thus, the reconstruction from 2-D frames to 3-D volume is expected to possess more motion information to reflect the structure of the imaging object.

A. Principle of Camera-Based Tracking
Theoretically, a point in the 3-D space can be located by triangulation if it can be seen by two or more lenses simultaneously. Fig. 1(a) and (b) shows the schematics of the tracking principle. $O_1$, $O_2$, and $O_3$ are the origins of three lenses whose principal axes are parallel with each other. The line connecting any two of the three lenses is a baseline. The baseline is perpendicular to the principal axes. In this article, the 3-D world reference coordinate (C) was chosen to be the right-handed, y-up coordinate system, where the origin was set to coincide with the center of the cameras by default. The 2-D marker position can be easily obtained in the image coordinate. Then, using the 2-D projections obtained from multiple lenses, the depth information can be estimated from the disparity of 3-D points in the different images. For example, when a point $P$ in the 3-D space is viewed by the left two lenses, the position $(X, Y, Z)$ could be calculated as follows [27]:

$$X = \frac{x_{p1}\, l_{12}}{x_{p1} - x_{p2}}, \qquad Y = \frac{y_{p1}\, l_{12}}{x_{p1} - x_{p2}}, \qquad Z = \frac{f\, l_{12}}{x_{p1} - x_{p2}}$$

where $(x_{p1}, y_{p1})$ and $x_{p2}$ are the image coordinates of $P$ in the first and the second lenses, $f$ is the focal length of all three cameras, $l_{12}$ is the baseline between the first and the second lenses, and $x_{p1} - x_{p2}$ is known as the disparity and represents how much the left image is displaced with respect to the right image. With the markers' positions known, the marker coordinate (M) associated with the rigid body created from the markers is also known. Therefore, the yaw ($\alpha$), pitch ($\beta$), and roll ($\gamma$) in the rotation matrix $R$ can be deduced from the homogeneous transformation matrix $T^{C}_{M}$ (representing the transformation from coordinate M to C) as follows:

$$T^{C}_{M} = \begin{bmatrix} R & \mathbf{p} \\ \mathbf{0}^{\mathsf{T}} & 1 \end{bmatrix}$$

where $R$ is the $3 \times 3$ rotation matrix and $\mathbf{p}$ is the origin of the marker coordinate M expressed in the reference coordinate C.
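The depth-from-disparity principle described above can be illustrated with a minimal Python sketch. It assumes two ideal pinhole lenses with parallel principal axes, image coordinates measured from each lens's principal point in the same units as the focal length, and a result expressed in the frame of the first lens. The function and argument names are illustrative and are not taken from the tracking software.

```python
def triangulate(x_p1, y_p1, x_p2, f, l_12):
    """Locate a point seen by two parallel pinhole lenses (illustrative sketch).

    x_p1, y_p1: image coordinates of the point in lens 1
    x_p2:       horizontal image coordinate of the same point in lens 2
    f:          focal length, same units as the image coordinates
    l_12:       baseline between lens 1 and lens 2
    Returns (X, Y, Z) in the frame of lens 1 (an assumption of this sketch).
    """
    disparity = x_p1 - x_p2
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the lenses")
    Z = f * l_12 / disparity   # depth shrinks as the disparity grows
    X = x_p1 * Z / f           # back-project the lens-1 image coordinates
    Y = y_p1 * Z / f
    return X, Y, Z


# A point at (0.1, 0.05, 2.0) seen with f = 1 and a 0.2 baseline
# projects to x_p1 = 0.05, y_p1 = 0.025, x_p2 = -0.05.
X, Y, Z = triangulate(0.05, 0.025, -0.05, f=1.0, l_12=0.2)
```

Because the depth is inversely proportional to the disparity, a fixed pixel-level error in the disparity produces a larger depth error at longer sensor distances.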
With $R_{ij}$ denoting the entry that lies in row $i$ and column $j$ ($i, j \in \{1, 2, 3\}$), the orientation can then be obtained from the entries of the rotation matrix.

Optical tracking systems using spherical markers coated with IR retroreflective material can achieve submillimeter tracking accuracy [16], [28]. The cable-free connection between the probe and the external tracking device gives the sonographers flexibility. It also has the potential to simultaneously track multiple targets, such as both the probe and the patient, in the event that the patient moves during the scanning.
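Reading the orientation from the entries of a rotation matrix can be sketched as follows. The angle convention is not recoverable from the text here, so this example assumes R = Rz(yaw) Ry(pitch) Rx(roll), which is only one of several common conventions and may differ from the one used by the tracking software. The function names are hypothetical.

```python
import math


def rotation_zyx(alpha, beta, gamma):
    """Build R = Rz(alpha) @ Ry(beta) @ Rx(gamma) (assumed convention)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cb, ca * sb * sg - sa * cg, ca * sb * cg + sa * sg],
        [sa * cb, sa * sb * sg + ca * cg, sa * sb * cg - ca * sg],
        [-sb, cb * sg, cb * cg],
    ]


def euler_zyx_from_rotation(R):
    """Recover (yaw, pitch, roll) in radians; valid while |pitch| < 90 degrees."""
    alpha = math.atan2(R[1][0], R[0][0])                          # R21 / R11
    beta = math.atan2(-R[2][0], math.hypot(R[2][1], R[2][2]))     # from R31
    gamma = math.atan2(R[2][1], R[2][2])                          # R32 / R33
    return alpha, beta, gamma


# Round trip: the angles used to build R are recovered from its entries.
angles = euler_zyx_from_rotation(rotation_zyx(0.3, -0.4, 1.1))
```

Near a pitch of ±90° the yaw and roll are no longer separately observable (gimbal lock), so a quaternion representation is often preferred for continuous rotations.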

B. Limitations of Camera-Based Tracking
One major disadvantage of camera-based tracking is that it requires direct line-of-sight, which is not possible if the object is out of the tracking volume [29], if there is occlusion [30], or if the markers are damaged [26]. In the application of the 3-D freehand US imaging, stepping out of the tracking volume is usually not problematic given the large pyramid tracking volume (compared with the operation space) provided by an up-to-date optical tracking device such as the one shown in Fig. 2. A marker whose retroreflective surface is damaged may fail to meet the user-defined brightness threshold and therefore may not be identified by the circle filter. Therefore, during operation and maintenance, operators are responsible for protecting the markers' surfaces, checking the markers' conditions, and replacing damaged markers before each use. Finally, and most frequently, occlusion may occur, although its occurrence depends greatly upon target geometry, marker size, and camera setup [28]. Occlusion can be further divided into two categories: self-occlusion (where markers are hidden by other markers) and background occlusion (where the marker rigid body is obscured by other objects).
To alleviate the occlusion, a multisingle-lens camera system may be adopted to ensure that the tracking volume is covered by multiple cameras arranged in different locations instead of an integrated camera bar [16]. This solution improves tracking robustness but introduces new challenges with installation, calibration, and cost. Given moderate probe motion in a US exam, the occlusion issues can instead be addressed by adopting a purposely designed marker arrangement. In a 3-D space, a minimum of three markers is required to define a rigid body. Additional markers can increase the robustness by reducing the likelihood of occlusion. Therefore, the marker rigid body usually consists of four or more markers. To avoid "flipping" during the tracking, the marker rigid body should be arranged asymmetrically.
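One simple way to screen a candidate layout for the asymmetry requirement is to check that all inter-marker distances differ from one another by more than a tolerance, so that no two marker pairs can be confused. The sketch below is a heuristic under that assumption, not the criterion used by the tracking software; the marker coordinates (in millimeters) and the 1 mm tolerance are hypothetical.

```python
import itertools
import math


def pairwise_distances(markers):
    """Sorted Euclidean distances between every pair of markers."""
    return sorted(math.dist(a, b) for a, b in itertools.combinations(markers, 2))


def is_asymmetric(markers, tol=1.0):
    """True if every pairwise distance differs from the others by more than tol."""
    d = pairwise_distances(markers)
    return all(later - earlier > tol for earlier, later in zip(d, d[1:]))


# Hypothetical layouts: a square is highly symmetric, the second one is not.
square = [(0, 0, 0), (40, 0, 0), (40, 40, 0), (0, 40, 0)]
irregular = [(0, 0, 0), (30, 0, 0), (0, 50, 0), (10, 20, 70)]
print(is_asymmetric(square), is_asymmetric(irregular))
```

A layout that fails this check can still be tracked in some software, but repeated distances make it easier for the solver to assign markers in a mirrored or rotated order.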
range of the 2-D marker rigid body was within 90°. If the US probe is operated outside this range, some of the markers may be hidden, and therefore, the entire rigid body cannot be identified by the camera, resulting in inaccurate or failed tracking. Since the rotation of a probe is not limited or predefined in a real US exam, we aimed to enable unrestricted freehand movement at a similar materials cost by using a hemispherical marker rigid body. A total of thirteen holes were designed on the hemisphere to hold the rods attached to the passive markers. One is centered at the top of the hemisphere; the other twelve are divided into four groups of three, with the holes in each group sharing the same radius and the same angle between the central axis of the hemisphere and the axis of the hole. Fig. 3 shows the 3-D model of the designed rigid body with hole axes angled at 30°, 45°, 60°, and 90°. By choosing the target holes and proper rod lengths, the attached markers may be separated asymmetrically so that if any one marker is hidden or partially hidden by another marker, the remaining markers are still at a minimal distance of 10 ± 1 mm. In this way, the self-occlusion issue may be mitigated. Five markers were attached to the marker rigid body with different rod lengths, as shown in Fig. 4(b). Considering the general rotation of an object, the pointing angle was defined in a spherical region. During US exams, the US probe is typically rotated by the sonographer in the upper hemispherical region. Therefore, the rigid body was designed with a hemispherical upper surface to hold the markers and a circular bottom with a rigid rod to connect the marker rigid body with the probe. According to the characteristics of the probe rotation and the geometry of the rigid body, a marker that is inserted in the top hole with a certain rod length can be viewed by the camera most often and was thus considered the primary marker.
To avoid the marker rigid body touching the patient's gravid abdomen during the operation, placing the markers on the top region of the rigid body rather than the side region is preferred. Therefore, the remaining markers were positioned in a descending spiral manner. Three markers occupied the three 30° holes. Based upon the rules of minimizing occlusion and asymmetric arrangement, different rod lengths were selected experimentally so that the four markers were not coplanar within the camera's view. Rod lengths were selected such that the marker rigid body would not touch the patient's abdomen when rotated during the exam. With four markers inserted on the top region of the hemisphere, the final marker was placed in a 90° hole, adding to the asymmetry and spreading the marker layout by increasing the Euclidean distance between markers to make the tracking more robust. The selected rod lengths for the markers attached to the 3-D hemispherical rigid body were 35.16, 107.86, 130.07, 73.93, and 84.58 mm. The resulting noncoplanar marker distribution is shown in Fig. 5. With this asymmetric marker distribution, the five attached markers were placed on concentric circles with different radii, ensuring a proper distance between every two of the five markers so that no pair of markers was too close for the camera to detect.
B. Data Acquisition and Processing
1) Rotational Tracking Performance: A rotational tracking test was conducted to assess the tracking performance during the 360° rotational motion. The test setups are shown in Fig. 4.
The data missing issue is addressed in Section IV (see Fig. 6). In the tests, the camera used for tracking the probe motion was an OptiTrack V120: Trio (NaturalPoint, Inc.). With three lenses integrated, no calibration of the camera is required at the user end, and the cost is relatively low compared with other commercial multisingle-lens camera systems. Both the 2-D marker rigid body and the hemispherical marker rigid body were attached to a stepper motor that had 200 steps per revolution. To imitate the rotational speed during a US exam, the stepper motor was set to rotate at 3 r/min. Both marker sets were rotated 360° about each of the axes for five successive tests. Without loss of generality, each test started from a different initial orientation. The motion was captured by the camera at an update rate of 120 Hz. The accuracy was evaluated based on the total recorded rotational angle and the difference between the initial and the final angles. Ideally, the motion would be captured accurately with no missing data. However, occlusion is an inherent drawback of optical tracking. Therefore, the amount of missing data served as another criterion for evaluating the tracking quality, with less missing data being better. Missing entries were filled from their neighbors using a moving average. The window length was set to 12 samples, which corresponds to 0.1 s at 120 Hz, a period during which the probe can be considered still.
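The gap-filling step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the processing code used in the study: missing samples are represented by `None`, the 12-sample window is taken as six samples on each side of the missing entry, and only originally recorded values contribute to the average.

```python
def fill_missing_moving_average(samples, window=12):
    """Fill None entries with the mean of the recorded neighbors in the window.

    samples: list of floats, with None marking a missing sample
    window:  total number of neighboring samples considered (half on each side)
    Entries with no recorded neighbor inside the window stay None.
    """
    half = window // 2
    filled = list(samples)
    for i, value in enumerate(samples):
        if value is not None:
            continue
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        neighbors = [v for v in samples[lo:hi] if v is not None]
        if neighbors:
            filled[i] = sum(neighbors) / len(neighbors)
    return filled


# A single dropped sample is recovered from its neighbors ...
short_gap = fill_missing_moving_average([1.0, None, 3.0])
# ... but the middle of a 20-sample gap has no neighbor within reach.
long_gap = fill_missing_moving_average([0.0] + [None] * 20 + [1.0])
```

The second call shows why long dropouts matter: samples in the middle of the gap remain unfilled because the window contains no recorded data. For angle data that wrap at ±180°, the values would need to be unwrapped before averaging.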
2) Translational Tracking and Volume Reconstruction: To validate volume reconstruction, a 3-D printed phantom [Fig. 9(a)] was scanned with a Butterfly iQ probe (Butterfly Network, Guilford, CT, USA). The phantom was made of ABS-M30 (Stratasys, Ltd.). According to Menikou and Damianou [32], the acoustic impedance of ABS-M30 is 2.13 ± 0.08 MRayl. In the imaging test (see Fig. 7), the phantom was clamped in a water tank filled with degassed water. The imaging frame rate was 25 Hz, whereas the update rate of the camera was much higher at 120 Hz. The probe was operated in the bladder mode with an imaging depth of 9 cm and a gain of 65%. The probe was fixed 22 ± 2 mm above the highest point of the phantom. Based upon the width of the phantom, the probe was translated 50 mm along the x-axis of the camera, from the left to the right side of the phantom. A 2-D US frame was captured every 0.2 mm. Therefore, 250 frames were captured along with the motion data. Translational tracking accuracy was assessed by comparing each recorded step with the predefined one. Since the probe was translated along one direction, the other five degrees of freedom were set as constant values for a better reconstruction. The frames were then stacked based upon the tracked position using MATLAB, and the volume was then reconstructed.
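The stacking step was carried out in MATLAB in the study; the following Python sketch only illustrates the idea for a pure single-axis translation. It assumes that each frame is placed in the slice nearest to its tracked x position, that the slice spacing equals the nominal 0.2 mm step, and that the other five degrees of freedom are constant. The function name and the placeholder frames are hypothetical.

```python
def stack_frames(frames, x_positions, step=0.2):
    """Place each 2-D frame into the volume slice nearest its tracked x position.

    frames:      list of 2-D frames (any objects, e.g. nested lists of pixels)
    x_positions: tracked x coordinate of the probe for each frame, in mm
    step:        slice spacing of the volume, in mm
    Returns a list of slices; slices that received no frame are None.
    """
    x0 = min(x_positions)
    n_slices = round((max(x_positions) - x0) / step) + 1
    volume = [None] * n_slices
    for frame, x in zip(frames, x_positions):
        k = round((x - x0) / step)
        volume[k] = frame  # a later frame overwrites an earlier one in the same slice
    return volume


# Three placeholder frames with slightly noisy tracked positions.
frames = [[[0]], [[1]], [[2]]]
volume = stack_frames(frames, [0.0, 0.19, 0.41])
# If the middle frame has no tracked position, its slice stays empty.
gappy = stack_frames([frames[0], frames[2]], [0.0, 0.41])
```

The second call illustrates how a tracking dropout leaves an empty slice that must later be interpolated, which is why the continuity of the motion data affects the reconstruction.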

A. Single-Axis Rotational Tracking Performance
TABLE I. Comparison of single-axis rotational tracking performance.
TABLE II. Mean missing data percentage of single-axis rotation.

Tables I and II summarize the overall tracking performance during single-axis rotation. The stepper motor was set to rotate one revolution at a time in each test. The recorded rotation was the rotational angle measured by the camera. The difference was defined as the error between the initial and final angle readings. Thus, the ground truth for the recorded rotation and the difference should be 360° and 0°, respectively. In addition, as mentioned in Section III, a 2-D marker rigid body greatly limited the rotational range of the probe operation. When the probe exceeded the allowed range, the camera no longer recognized the whole rigid body; to avoid adding bias or noise, no motion data were recorded at that time. Accordingly, the tracking performance of the hemispherical marker rigid body in this study was also evaluated based on the missing data percentage, defined by the following equation:

$$\text{Missing data percentage} = \frac{n_{\text{missing}}}{n_{\text{total}}} \times 100\%$$

where $n_{\text{missing}}$ is the number of missing data points and $n_{\text{total}}$ is the total number of data points. The test results in Table I indicated that the two marker rigid bodies achieved similar tracking accuracy, and Table II demonstrated a significant improvement in tracking robustness with the hemispherical marker rigid body. A missing data percentage of 100% in Table II means that the rotational tracking about the z-axis was completely disabled while using the 2-D marker rigid body. This was because when rotating about the z-axis, the marker rigid body plane was always perpendicular to the image plane of the camera. In this scenario, all five markers appeared collinear, and the rigid body could not be identified by the camera at any angle. Similarly, when rotating about the x- and y-axes, there was a rotational range where the markers were considered collinear by the camera.
Thus, the missing data percentage was much higher than that of the hemispherical marker rigid body. A higher missing data percentage means less accurate tracking and data that are harder to interpolate with a moving-mean method. Fig. 6(a) shows one sample case of tracking the single-axis rotation using the 2-D marker rigid body. In this case, 20.64% of the data were missing, and only a few samples could be filled with the window length of 12 samples; the rest remained missing due to inadequate neighboring entries. However, with the hemispherical marker rigid body attached, only 1 data point out of 2810 was lost. Therefore, the motion plotted in Fig. 6(b) was more reliable, and the missing data point could be easily filled.
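As a small worked example of the missing data percentage, the Python sketch below computes the metric for a recording with 1 lost sample out of 2810, the hemispherical case quoted above. It also reports the longest run of consecutive missing samples, which is an additional diagnostic introduced here for illustration (it is not a metric reported in the study) because a moving-average fill can only bridge gaps shorter than its window. Missing samples are represented by `None`, and the sample values are placeholders.

```python
def missing_data_percentage(samples):
    """Percentage of entries that are None (n_missing / n_total * 100)."""
    n_total = len(samples)
    n_missing = sum(1 for v in samples if v is None)
    return 100.0 * n_missing / n_total


def longest_gap(samples):
    """Length of the longest run of consecutive None entries."""
    longest = current = 0
    for v in samples:
        current = current + 1 if v is None else 0
        longest = max(longest, current)
    return longest


# 2810 samples with a single dropout, as in the hemispherical example.
recording = [0.0] * 2810
recording[1000] = None
print(round(missing_data_percentage(recording), 4))  # about 0.036 %
print(longest_gap(recording))                         # a one-sample gap
```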

B. Single-Axis Translational Tracking Performance
As mentioned in Section III-B, the probe was translated in steps of 0.2 mm. In total, 249 translation steps were made to move the probe from the left side of the phantom to the right. When the camera was placed 1.92 m away from the phantom at the same height, the averaged measured step was 0.1902 mm, indicating that the tracking system had acceptable tracking performance with an averaged error of $9.8 \times 10^{-3}$ mm in this experimental setting. A histogram of the measured steps was plotted (Fig. 8), and a normal density function was then fitted to further demonstrate the distribution. From the distribution, 77.91% of the measured steps fell between 0.15 and 0.25 mm, with only three outliers smaller than 0.12 mm and three outliers larger than 0.3 mm. The variance of the tracked step was 0.0016 mm², which further indicated a stable tracking process.
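The reported figures can be cross-checked with a short calculation. The sketch below assumes that the fitted normal density has the reported mean (0.1902 mm) and variance (0.0016, taken to be in mm²); the parameters of the actual fit are not given in the text, so this is a plausibility check rather than a reproduction of the analysis.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


nominal_step = 0.2             # mm, commanded step
mean_step = 0.1902             # mm, averaged measured step
std_step = math.sqrt(0.0016)   # 0.04 mm, assuming the variance is in mm^2

mean_error = nominal_step - mean_step  # about 9.8e-3 mm
fraction = normal_cdf(0.25, mean_step, std_step) - normal_cdf(0.15, mean_step, std_step)

print(f"mean error: {mean_error:.4f} mm")
print(f"expected share of steps in [0.15, 0.25] mm: {fraction:.3f}")
```

Under these assumptions the normal model predicts roughly 77% to 78% of the steps inside the 0.15 to 0.25 mm interval, which is in line with the 77.91% read from the measured distribution.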

C. Imaging Test With a Four-Month Fetal Phantom
An imaging test with a four-month fetal phantom was conducted to demonstrate the volume reconstruction performance with enhanced US probe tracking using the designed hemispherical marker rigid body. The volume reconstruction in freehand 3-D US is achieved by specifying the position and orientation of each B-scan image [33] in the reconstruction volume. Thus, the reconstruction quality depends on both the tracking accuracy and the continuity of the collected data. In terms of continuity, tracking with the 2-D marker rigid body can lose the trajectory for up to a few seconds due to marker occlusion, so that the frames generated during that time cannot be inserted into the volume. Fig. 9(a) shows the fetal phantom that was used for the volume reconstruction. Comparing Fig. 9(b) with Fig. 9(a), the upper surface of the reconstructed phantom was clearly displayed. By rotating the reconstructed volume, the forehead, arms, and femurs of the phantom could be viewed without mentally combining the frames to imagine the structure. This reconstruction verified that despite an averaged missing data percentage of 0.36%, the collected motion data were sufficient to reconstruct the target volume.
In addition, similar to the special case of collinear fiducials discussed in [34], many commercial marker rigid bodies with collinear fiducials cause failures in registration because no unique solution exists. As discussed in this study, when the 2-D rigid body is perpendicular to the camera lens, the visible markers are collinear in the camera's view. Thus, a large portion of the data (up to 41.54% in the lab tests) will be missing in such cases, and the corresponding frames cannot be inserted into the reconstruction volume. The lack of a unique registration solution can also seriously affect reconstruction and visualization.
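The collinearity problem can be illustrated with a simple geometric test. The Python sketch below approximates the camera view by an orthographic projection that drops the depth coordinate, which is a simplification of a real perspective lens, and then measures how far the projected markers lie from a common line. The marker coordinates (in millimeters), the tolerance, and the function names are hypothetical.

```python
import math


def max_offline_distance(points):
    """Largest distance of any 3-D point from the line through points[0]
    and the point farthest from it."""
    p0 = points[0]
    far = max(points, key=lambda p: math.dist(p0, p))
    d = [b - a for a, b in zip(p0, far)]
    norm = math.hypot(*d)
    if norm == 0.0:
        return 0.0  # all points coincide
    worst = 0.0
    for p in points:
        v = [b - a for a, b in zip(p0, p)]
        cross = (
            v[1] * d[2] - v[2] * d[1],
            v[2] * d[0] - v[0] * d[2],
            v[0] * d[1] - v[1] * d[0],
        )
        worst = max(worst, math.hypot(*cross) / norm)
    return worst


def appears_collinear(markers, tol=1.0):
    """True if the markers look collinear when viewed along the z (depth) axis."""
    projected = [(x, y, 0.0) for x, y, _ in markers]
    return max_offline_distance(projected) < tol


# A planar body seen edge-on (all markers at y = 0) collapses to a line,
# while a noncoplanar layout keeps its spread in the image.
edge_on_planar = [(0, 0, 100), (20, 0, 130), (45, 0, 90), (70, 0, 150), (-30, 0, 120)]
noncoplanar = [(0, 35, 0), (40, 90, 10), (-50, 110, 30), (20, 60, -45), (80, 10, 5)]
print(appears_collinear(edge_on_planar), appears_collinear(noncoplanar))
```

When the projected markers are collinear, a rotation about that line leaves the image unchanged, so the pose cannot be determined uniquely from that view.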

V. CONCLUSION AND DISCUSSION
In this study, we constructed a printable hemispherical rigid body with 13 holes to accommodate passive retroreflective markers. Based on the test results, this configuration allowed successful camera-based tracking of a 360° rotation with little missing data, which improves both the tracking accuracy and (by a significant margin) the rotational range. With markers attached to the rigid body and proper rod lengths connecting the markers to the rigid body, users can create an asymmetric, well-distributed marker layout for their tracking purposes. A noncoplanar marker layout performs more robustly during the 3-D freehand US exam compared with the 2-D marker rigid body. In this study, on average, 99.76% of the rotational motion information was recorded during the tracking with the hemispherical marker rigid body, while only 54.01% on average was captured using the coplanar marker configuration. In data postprocessing, missing data are easier to fill when the missing data percentage is small, as is the case with the hemispherical marker rigid body. Future work includes reducing the size of the hemispherical rigid body so that the small markers are less likely to be hidden by the marker rigid body. An algorithm to automatically generate the marker layout is also vital to making the design more practical and user-friendly. Additional tests with freehand US operation should be conducted in clinical settings to verify the feasibility of this camera-based tracking system along with the designed marker rigid body.