Discrepancies between roughness measurements obtained with phase-shifting and white-light interferometry

Discrepancies between phase-shifting and white-light interferometry have been observed in step-height and surface roughness measurements. The discrepancies have a strong relation to the roughness average parameter of the surface. The skewing effect, which mainly occurs in the vicinity of peaks, valleys, and edges of the sample, causes this problem in white-light interferometry of step height. For roughness, two possible sources of the discrepancy are considered.


Introduction
A number of profiling techniques are capable of measuring surfaces with nanometer-scale step heights and subnanometer roughness. 1 These include stylus-based profiling, 2 phase-shifting interferometry 3 (PSI), white-light interferometry 4-6 (WLI, also referred to as white-light vertical scanning interferometry 7,8 or coherence radar 9), Nomarski profiling, 10 and atomic force microscopy. 11 Among the various measurement methods, optical techniques offer the advantage of fast, noncontact area profiling of surfaces. WLI in particular uses a short-coherence light source whose fringe visibility is narrowly localized, so that surfaces can be profiled without the 2π ambiguity that limits the phase-shifting technique. Many other advantages of WLI are already well known. 4-9,12-15 One of the most important is that WLI is easy to combine with PSI, which can provide more accurate measurement with a precision as high as λ/1000. 7 Although WLI is a well-established technique for surface measurement, it has exhibited some problems with step heights 8 and rough surfaces whose peak-valley values are less than the coherence length of the light source. Intuitively, the two techniques, PSI and WLI, should agree on measurements in their overlapping measuring range. However, discrepancies between these two techniques for a specific microscope have been observed with a particular group of specimens. In step-height measurement, localized spikes (batwings) observed at the edge of a surface feature are known as a source of discrepancy. 8 In practice, techniques in which both phase and coherence information are used have been proposed to correct the problem for step heights. 7,16 For roughness, discrepancies between PSI and WLI observed by careful experiments have not yet been reported to our knowledge.
To check and calibrate the instrument, in Section 2 we first test a series of standard step-height specimens whose range is from 8 to 1015 nm using both PSI and WLI. In Section 3 we then test standard periodic gratings and random roughness specimens whose roughness average Ra (Ref. 17) values range from 3 to 500 nm. Within this range, the discrepancy has a strong relation to the surface roughness parameter Ra. In Section 4 we consider two possible sources of the discrepancies observed. The first is a diffraction model based on Harasaki and Wyant's work. 8 The second is a qualitative observation about the diffracted light in the microscope. Our experimental data were obtained with Mirau-type interferometers operating with either a 20× [numerical aperture (NA) of approximately 0.4, NA correction factor ~1.027] or a 50× (NA of approximately 0.55, NA correction factor ~1.074) objective as shown in Table 1. However, Michelson and Linnik types may also show the same problems.

Step Height Check
The optical instrument we used has a 736 × 480 CCD camera that provides 480 profiles and 736 pixels in each profile. To check and calibrate the instrument, we tested a series of standard step-height specimens fabricated on 5 mm × 5 mm silicon substrates and overcoated with a layer of opaque chromium approximately 100 nm in thickness. The step heights were calibrated by interferometric techniques.
Much effort has been devoted to improving coherence peak detection algorithms for WLI during the past decade. 18-20 In addition to these efforts, phase and coherence combining techniques 7,16 were suggested to correct the skewing spike errors at the edges of a step discontinuity whose height is less than the coherence length. However, for the user to determine a step height, masking off the edge area can be an effective and sufficient method for blocking out the skewing effect. To calculate the step heights, we used a modification 21 of a two-sided algorithm used in our step calibrations. 22 We averaged step heights calculated from 418 to 480 measured profiles to reduce the noise. Some bad profiles contaminated by dust were not included in the calculation. The reduction of the noise by averaging is especially important for the smallest step (8 nm) measurement from profiles obtained using PSI and WLI. The resulting step-height values are in good agreement as shown in Table 2. Figure 1 is a graph of the calculated heights with PSI and WLI of the seven specimens under the modified two-sided algorithm. The data points are offset in the x direction from each other for easy viewing. Error bars equal to ±1 standard deviation show the overlap of the values. The WLI profile measurements are much noisier than the PSI measurements as shown by the standard deviations. Nevertheless, the calculated step heights from the two interferometers are very close. Specimens 3 and 5 show slight differences with this algorithm. This is due to their smaller step width of approximately 25 μm (less than 60 pixels per profile with the 20× objective), compared with approximately 100 μm (slightly more than 245 pixels per profile with 20×) for the rest of the specimens. Since the step is not as wide, there are fewer points used to determine the least-squares lines, thus resulting in slightly skewed lines.
We conclude from these results that the z-scale calibration is not the source of the differences we observe for roughness measurements, discussed below.
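The effect of masking off the edge region can be illustrated with a minimal Python sketch. It fits least-squares lines to the plateaus on either side of a single edge and evaluates their separation at the edge. This is a simplified stand-in, not the modified two-sided algorithm of Refs. 21 and 22, whose details are not given here. The function names, the synthetic 8 nm step, the noise level, and the spike shape are all assumptions chosen for illustration.

```python
import numpy as np

def step_height(x, z, edge, mask_halfwidth):
    """Step height at `edge` from straight lines fitted on each side.

    Points within `mask_halfwidth` of the edge are excluded so that
    edge spikes (batwings) do not enter the least-squares fits.
    """
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    left = x < edge - mask_halfwidth
    right = x > edge + mask_halfwidth
    # Independent first-order fits on the lower and upper plateaus.
    slope_l, icpt_l = np.polyfit(x[left], z[left], 1)
    slope_r, icpt_r = np.polyfit(x[right], z[right], 1)
    # Evaluate both lines at the edge and take the difference.
    return (slope_r * edge + icpt_r) - (slope_l * edge + icpt_l)

def mean_step_height(x, profiles, edge, mask_halfwidth):
    """Average the per-profile step heights to reduce noise."""
    heights = np.array([step_height(x, p, edge, mask_halfwidth) for p in profiles])
    return heights.mean(), heights.std(ddof=1)

# Synthetic data (hypothetical): 50 profiles of 736 pixels with an 8 nm step,
# 1 nm Gaussian noise, and +/-20 nm spikes on the three pixels either side of the edge.
rng = np.random.default_rng(0)
x = np.arange(736, dtype=float)
edge = 367.5
ideal = np.where(x > edge, 8.0, 0.0)
profiles = ideal + rng.normal(0.0, 1.0, size=(50, x.size))
profiles[:, 365:368] -= 20.0   # undershoot just before the edge
profiles[:, 368:371] += 20.0   # overshoot just after the edge

masked_mean, masked_sd = mean_step_height(x, profiles, edge, mask_halfwidth=10.0)
unmasked_mean, _ = mean_step_height(x, profiles, edge, mask_halfwidth=0.0)
print(f"masked: {masked_mean:.2f} nm (sd {masked_sd:.2f} nm), unmasked: {unmasked_mean:.2f} nm")
```

In this synthetic case the unmasked fit is pulled away from the nominal height by the spikes, whereas the masked fit is not, which is the motivation for excluding the edge area.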

Roughness Measurement
When the user measures a periodic grating or a random specimen using optical techniques, many error sources may affect the measured result. Figure 2 shows the profile discrepancy between the PSI and the WLI readings. The test sample Rubert 23,24 529 sinusoidal grating has 0.1 μm Ra, 10 μm surface spatial wavelength, and a 330 nm peak-valley value. From Fig. 2(a), the PSI result seems to correspond to the expected profile, whereas the WLI result shows noiselike spikes on the top and bottom of the grating. This effect causes shape distortion and discrepancies in the Ra value between PSI and WLI. Nevertheless, the periods of the grating measured by both optical techniques are in good agreement with each other. Figures 3(a) and 3(b) show another set of results for a Rubert 528 sinusoidal grating that has 0.5 μm Ra, 50 μm spatial wavelength, and a 1.6 μm peak-valley value. From Fig. 3, the discrepancy between the two interferometry modes seems to be less serious than for the 529 grating. It is clear that the discrepancy has a relation to the grating's specification such as the peak-valley value, the period, and the Ra value. Because we do not have enough observations to distinguish the spatial wavelength effects from the amplitude effects, we take the Ra value as a measure of the surfaces exhibiting this phenomenon. The Ra represents the surface roughness amplitude and is also useful when we measure a random surface that has no specified period and amplitude. Other roughness amplitude parameters, such as rms roughness, are also useful measures. Figure 4 is a typical example. The two measured random profiles from both interferometers are similar but with some differences. We need a method to quantify these differences. The cross-correlation function can be a measurand to quantify a difference between two profiles. However, it is just a function defined by two profiles, i.e., a comparative tool rather than a parameter that can represent a surface.
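To make these measures concrete, the following minimal Python sketch evaluates Ra, the rms roughness (written Rq here), and the peak of a normalized cross-correlation for sampled profiles. The function names are hypothetical, and no filtering or form removal beyond mean subtraction is applied. The test profile is an ideal sinusoid with the nominal 330 nm peak-valley value and 10 μm spatial wavelength quoted for the Rubert 529 grating, not measured data, and the 40 nm spikes added at its crests and troughs are an arbitrary illustration.

```python
import numpy as np

def ra(z):
    """Roughness average: mean absolute deviation from the mean line."""
    z = np.asarray(z, dtype=float)
    return np.mean(np.abs(z - z.mean()))

def rq(z):
    """Root-mean-square roughness about the mean line."""
    z = np.asarray(z, dtype=float)
    return np.sqrt(np.mean((z - z.mean()) ** 2))

def max_cross_correlation(z1, z2):
    """Peak of the normalized cross-correlation of two equal-length profiles."""
    a = np.asarray(z1, dtype=float) - np.mean(z1)
    b = np.asarray(z2, dtype=float) - np.mean(z2)
    cc = np.correlate(a, b, mode="full") / (a.size * a.std() * b.std())
    return cc.max()

# Ideal sinusoid: 330 nm peak-valley, 10 um period, 10 periods, heights in nm.
x_um = np.linspace(0.0, 100.0, 4000, endpoint=False)
z_sine = 165.0 * np.sin(2.0 * np.pi * x_um / 10.0)

# Hypothetical distortion: push the crests and troughs outward by 40 nm.
z_spiky = z_sine.copy()
near_extrema = np.abs(z_sine) > 0.98 * 165.0
z_spiky[near_extrema] += 40.0 * np.sign(z_sine[near_extrema])

ra_sine, rq_sine = ra(z_sine), rq(z_sine)
ra_spiky = ra(z_spiky)
cc_peak = max_cross_correlation(z_sine, z_spiky)
print(f"Ra sine {ra_sine:.2f} nm, Rq sine {rq_sine:.2f} nm, Ra spiky {ra_spiky:.2f} nm")
print(f"cross-correlation peak {cc_peak:.4f}")
```

For the ideal sinusoid Ra equals the peak-valley value divided by π, about 105 nm. The spiky copy has a larger Ra while its cross-correlation peak with the original remains close to 1, which illustrates why the cross-correlation alone is a comparative tool rather than a surface parameter.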
We have combined our roughness test results in Fig. 5 and Table 3. From Fig. 5 we can clearly see that the difference between PSI and WLI has a strong relationship to the surface roughness parameter Ra in the range from zero to several hundred nanometers. We started with a smooth mirror that has 3 nm Ra as measured with the WLI. With our baseline point, the smooth mirror can be considered as a random surface whose Ra value is extremely small. Then we tested four one-dimensional random specimens (Rubert 501, 502, 503, 504), four sinusoidal gratings (sample 3, Rubert 529, SRM2071, Rubert 528), and one periodic specimen having a cusp shape profile (No. 00635) with various Ra. We did not obtain a measurement for Ra over 500 nm because this is the largest roughness sample among our specimen list. In addition, PSI is expected to become less accurate as Ra increases above ~150 nm. Before each test, we calibrated the instrument with a step-height standard whose step-height value is well matched to the amplitude of the measuring sample to minimize any nonlinearity issues in WLI. Whether the sample is periodic or random, discrepancies have a peak within the 100-200 nm Ra range and decrease outside of this range. The shape of the sample does not seem to be an important factor. Noiselike spikes occur at the top and bottom of the cusp shape specimen in the same manner as the sinusoidal grating, and the size of the Ra discrepancy is similar to those for the sinusoidal profile specimens as shown in Fig. 5. The root-mean-square slope, 17 RΔq, of each specimen was calculated and is shown in the last column of Table 3 to check the surface slope effect in the WLI reading.
Fig. 1. Graph of the calculated heights with PSI and WLI of the seven specimens calculated with a two-sided algorithm. The numbers along the x axis are the step-height specimen numbers shown in Table 2. Sd, standard deviation.
We conclude that a slope effect is not a dominant factor of the discrepancy within the Ra range between 0 and 500 nm. For example, the Rubert 529 and 528 sinusoidal gratings have almost the same RΔq value, as shown in Table 3, but show a significantly different discrepancy in Fig. 5. Figure 2 gives us a clearer clue. The profile difference between PSI and WLI occurs on the minimum slope area (top and bottom positions of the grating) where the slope effect should be small. We believe that the noiselike spikes in the profile have a predominant role in the discrepancies plotted in Fig. 5.
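As a rough consistency check on this comparison, the nominal parameters of an ideal sinusoidal profile can be worked out in closed form. The sketch below assumes perfect sinusoids with the nominal peak-valley values and spatial wavelengths quoted above, where A is half the peak-valley value and Λ is the spatial wavelength. The measured values in Table 3 may differ.

```latex
\begin{align}
z(x) &= A \sin\!\left(\frac{2\pi x}{\Lambda}\right), \\
R_a &= \frac{1}{\Lambda}\int_0^{\Lambda} |z(x)|\,dx = \frac{2A}{\pi}, \\
R_{\Delta q} &= \sqrt{\frac{1}{\Lambda}\int_0^{\Lambda}\left(\frac{dz}{dx}\right)^{2} dx}
             = \frac{\sqrt{2}\,\pi A}{\Lambda}. \\
\text{Rubert 529:}\quad A &= 0.165\ \mu\text{m},\ \Lambda = 10\ \mu\text{m}
  \;\Rightarrow\; R_a \approx 0.105\ \mu\text{m},\quad R_{\Delta q} \approx 0.073, \\
\text{Rubert 528:}\quad A &= 0.8\ \mu\text{m},\ \Lambda = 50\ \mu\text{m}
  \;\Rightarrow\; R_a \approx 0.509\ \mu\text{m},\quad R_{\Delta q} \approx 0.071.
\end{align}
```

Under these assumptions the two gratings have nearly equal rms slope while their Ra values differ by about a factor of 5, in line with the argument that slope alone does not account for the different discrepancies.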
We also tested the sample with one other WLI instrument to check whether this phenomenon depends on the particular instrument. There are offsets between the data from the original (instrument 1) and the new instrument (instrument 2). However, the pattern is similar to the original one, as shown in Fig. 6. The relation between the discrepancy and the Ra value therefore is not related to a specific WLI instrument.
Last, we compared the readings with the calibrated stylus results. Figures 7(a) and 7(b) illustrate the PSI and the WLI deviation from the stylus readings. Even though the stylus is not a perfect instrument, it is useful as a standard technique and the PSI measurements are generally in good agreement with the stylus measurements. For the stylus results, the tip radius was 1.7 μm × 0.2 μm (nominal 2 μm), digitization intervals were 0.01 μm (vertical) and 0.25 μm (horizontal), and the nominal stylus loading was ~1 mN. The specimens were measured at nine positions using a Gaussian filter (0.25 mm long wavelength cutoff, 1.25 μm short wavelength cutoff) with 1.25 mm evaluation length (the actual traversing length is longer than 1.5 mm). Each stylus Ra value is averaged from nine profiles. From Fig. 7, the deviation between them is less than or equal to 6 nm in Ra value except for the Rubert 504 specimen. This specimen has surface slopes that may be high enough to cause inaccuracy in the measured roughness topography by PSI. So we are convinced that the characteristic illustrated in Figs. 5 and 6 is due to the WLI, because the PSI readings seem to have no special relation to the Ra value in Fig. 7(a).
Table 3 (footnotes). The relative uncertainties due to position variation are approximately less than 7%. Slope measurements of surface profiles depend sensitively on the sampling interval used. Therefore absolute uncertainties are difficult to assess. Because all the RΔq values were calculated with the same sampling interval, the relative uncertainties due to position variation are approximately less than 5% (k = 1).

Diffraction Models
We present a model to describe the WLI spikes that mainly occur near peaks, valleys, and edges of a sample.
In the model we assumed a normal-incidence plane wave on a sinusoidal grating surface as illustrated in Figs. 8 and 9. We also assumed that the Fresnel approximation was not valid in our case because our interest was focused on a submicrometer level surface variation. For simplicity we illustrate the model in one dimension only, which means that the sample varies in the x direction but not the y direction. This aspect of the model is consistent with the samples we used. The objective lens collects not only the reflected and diffracted beam from the ideal imaging point but also the neighboring light from the vicinity within a 1.22λ/NA diameter. In addition, the CCD camera has a finite pixel size, approximately 9.8 μm × 8.4 μm, so that the measured intensity is obtained from the sum of incident light in a pixel as shown in Figs. 8(a) and 8(b). This neighboring light might influence the spikes in a WLI measurement. Particularly when the surface variation is less than the depth of focus of the objective lens and less than the coherence length, the interference among these beams will be more apparent.
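To give a feel for the length scales involved, the short Python sketch below compares the Airy disk diameter, 1.22λ/NA, with the nominal size of one CCD pixel projected onto the sample. It assumes a wavelength of 600 nm (the center wavelength used later in the simulation), takes the larger quoted pixel dimension of 9.8 μm, and assumes that the projected pixel size is simply the pixel size divided by the objective magnification with no additional relay magnification. The function names are hypothetical.

```python
def airy_disk_diameter_um(wavelength_um, na):
    """Diameter of the Airy disk (first dark ring): 1.22 * lambda / NA."""
    return 1.22 * wavelength_um / na

def pixel_footprint_um(pixel_um, magnification):
    """Nominal size of one CCD pixel projected back onto the sample."""
    return pixel_um / magnification

WAVELENGTH_UM = 0.6                 # assumed center wavelength, 600 nm
PIXEL_UM = 9.8                      # larger CCD pixel dimension quoted above
objectives = {20: 0.40, 50: 0.55}   # magnification -> approximate NA

results = {
    mag: (airy_disk_diameter_um(WAVELENGTH_UM, na), pixel_footprint_um(PIXEL_UM, mag))
    for mag, na in objectives.items()
}
for mag, (airy, footprint) in results.items():
    print(f"{mag}x: Airy disk diameter {airy:.2f} um, pixel footprint {footprint:.3f} um")
```

With these nominal values the Airy disk is several times wider than the area imaged by one pixel for both objectives, which is why light from neighboring surface points can contribute to the signal of a single pixel.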
We can express the diffracted wave at a z plane as Eq. (1), 25 where u_i(x_0, z_0) is an input wave in the z_0 plane at the location x_0, u_p(x − x_0, z − z_0) denotes a pinhole diffraction wave, k is the wave number, and r is the distance between two arbitrary points on the z plane and the surface z_0. We can reexpress Eq. (1) as Eq. (2), where FT^−1 is the inverse Fourier transformation and ν is the x-directional spatial frequency of the light. Using Eqs. (1) and (2), U(ν, z) is expressible in the form of Eq. (3), where U_i = FT[u_i].
From a physical sense, we do not need to consider the negative z direction and the evanescent field. Therefore z is always positive and (1/λ² − ν²) > 0, i.e., −1/λ < ν < 1/λ. From the physical model and Eq. (3), the Fourier-transformed test arm light yields Eq. (4), where U_i = FT[exp(−ikz_0)].
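For orientation, the restriction to propagating waves used here is the one that appears in the textbook scalar angular-spectrum description of free-space propagation. The sketch below gives that standard form, assuming the input field u_i(x) is specified on a reference plane at z = 0 and writing ν for the spatial frequency. It is not necessarily identical to the authors' Eqs. (1)-(3).

```latex
\begin{align}
u(x, z) &= \mathrm{FT}^{-1}\!\left[U(\nu, z)\right]
         = \int_{-1/\lambda}^{1/\lambda} U(\nu, z)\, e^{i 2\pi \nu x}\, d\nu, \\
U(\nu, z) &= U_i(\nu)\,
             \exp\!\left[\, i 2\pi z \sqrt{\frac{1}{\lambda^{2}} - \nu^{2}}\,\right],
\qquad U_i(\nu) = \mathrm{FT}\!\left[u_i(x)\right].
\end{align}
```

For |ν| > 1/λ the square root becomes imaginary and the corresponding component decays exponentially with z (the evanescent field), which is why only the band −1/λ < ν < 1/λ is retained.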
In Eq. (4), u_test is the test arm light, and H(x) represents the distribution function of the collected light intensity within S. For example, the central light intensity in S will have a maximum value, whereas the neighboring intensity from the vicinity will be smaller than the central light. We assumed H(x) to be a normalized Gaussian function, which means that the central light has more weight. For illustration, z_0 is assumed to be a sinusoidal function as shown in Fig. 9. S is the size of the light collected by a single pixel, which is determined by the CCD pixel size and an Airy disk diameter and can be expressed as Eq. (5), where P is the size of one CCD pixel and M is the magnification factor of the objective as shown in Figs. 8(a) and 8(b). From Eqs. (2) and (4) and the condition −1/λ < ν < 1/λ, the test arm light yields Eq. (6). Finally, following the approach of Harasaki and Wyant, 8 the measured intensity at the CCD camera is given by Eq. (7), where λ_1 and λ_2 are the wavelengths at both ends of the spectrum of the light source, u_reference is the light from the reference arm, and F(λ) denotes the spectral distribution of the light centered at λ_0. Here we set F(λ) as 1 between λ_1 and λ_2 for convenience. The reference light u_reference was also set as a plane wave within the measuring area. From Eqs. (6) and (7), we calculated the intensity numerically. We used λ_1 = 550 nm, λ_2 = 660 nm, the center wavelength λ_0 = 600 nm, and the vertical sampling distance Δz = 80 nm because a real WLI has that specification. To numerically integrate Eq. (4), we took 1000 data points in the x direction. The test sample was a virtual sinusoidal surface profile with a 330 nm peak-valley value, 10 μm surface spatial wavelength, and ~105.04 nm in Ra value as shown in Fig. 10. It is the same as the Rubert 529 grating that yields the largest discrepancy among the tested sinusoidal gratings. We also applied the centroid algorithm to calculate the best focus. 8 After simulation, a 398.16 nm peak-valley and 121.2 nm Ra value was calculated with the WLI algorithm as illustrated in Fig. 10. We can also see the spikes at the top and bottom and relatively small distortion elsewhere. At the side of the sine shape, the vicinity light also exerts an influence on the interference pattern. However, its effect appears symmetrical in the interference pattern so that the envelope peak of the pattern is not affected. One thing to be noted is that the simulation in Fig. 10 is not a perfect match for the measurement result of Fig. 2(b). It may be that the one-dimensional simulation conditions we assumed are not perfect or that the error arises from a different source. There are many other possible error sources such as nonlinearity or slope effects. A vector field propagation approach such as the rigorous coupled-wave analysis 26,27 would provide a more accurate simulation than ours and might explain the phenomenon. Furthermore, we may need an extension of our model that includes nonlinearity, NA effects, light scattering, 9 and the slope of the surface feature.
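The coherence-peak step of such a simulation can be sketched in a few lines of Python. The sketch is deliberately much simpler than the model above: it treats a single flat surface point with no neighboring-light term H(x), uses a two-beam interference signal with equal arm intensities and a flat spectrum, and locates the height with one common form of the centroid algorithm (the centroid of the squared intensity difference between adjacent scan positions). The spectrum limits and the 80 nm scan step follow the values quoted above. The function names, the scan range, and the 100 nm test height are assumptions.

```python
import numpy as np

LAMBDA_1_NM, LAMBDA_2_NM = 550.0, 660.0   # ends of the flat spectrum
DZ_NM = 80.0                              # vertical sampling distance

def correlogram(z_scan, height, n_lambda=221):
    """White-light interference signal versus scan position (arbitrary units).

    Each wavelength contributes 1 + cos(4*pi*(z - h)/lambda); the factor
    4*pi accounts for the double pass on reflection.
    """
    lam = np.linspace(LAMBDA_1_NM, LAMBDA_2_NM, n_lambda)
    phase = 4.0 * np.pi * (z_scan[:, None] - height) / lam[None, :]
    return np.mean(1.0 + np.cos(phase), axis=1)

def centroid_height(z_scan, intensity):
    """Centroid of the squared first difference of the intensity."""
    weight = np.diff(intensity) ** 2
    z_mid = 0.5 * (z_scan[:-1] + z_scan[1:])   # position of each difference
    return np.sum(z_mid * weight) / np.sum(weight)

z_scan = np.arange(-8000.0, 8000.0 + DZ_NM, DZ_NM)   # assumed scan range, nm
true_height = 100.0                                  # assumed surface height, nm
signal = correlogram(z_scan, true_height)
estimate = centroid_height(z_scan, signal)
print(f"true height {true_height:.1f} nm, centroid estimate {estimate:.1f} nm")
```

In this idealized case the estimate lands well within one scan step of the true height. A small residual bias can remain when the scan is not centered on the surface, because the envelope of a flat-topped spectrum has slowly decaying sidelobes that are truncated asymmetrically.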
We suggest another issue that may have an important influence on the discrepancy. Figure 11(a) illustrates that broadly scattered light from a rough surface would minimize the spikes, whereas a smooth mirror surface produces a reflection along the optic axis with minimal wave-front distortion as shown in Fig. 11(b). However, for a moderately rough surface with Ra from 50 to 150 nm as illustrated in Fig. 11(c), the combination of specular and diffuse light might produce an unexpected effect on a roughness measurement.
Fig. 10. Simulated WLI readings with a virtual sine shape. The virtual sample was generated with a 330 nm peak-valley value and 10 μm surface spatial wavelength.

Conclusions
Although WLI is a well-established technique for surface measurement, it has been shown to be susceptible to a skewing effect for step height or surface roughness whose peak-valley value is less than the coherence length of the light source. To confirm the problem, we first tested a series of standard step-height specimens whose range is from 8 to 1015 nm using both PSI and WLI. In Section 2 we arrived at good results with a simple masking algorithm that can block the skewing spikes at the edge. We also tested standard periodic gratings and random roughness specimens whose Ra value is from 3 to 500 nm. Within this range, the discrepancy between PSI and WLI shows a clear dependence on the surface roughness parameter Ra. This phenomenon is especially prominent between 50 and 300 nm in Ra value. It seems to be unrelated to the specific instrument, profile shape, and randomness. We examined a diffraction model in Section 4, which gave us a result similar to the experimental result but of much smaller magnitude. We also made a qualitative observation about the diffraction field from surfaces with roughness between ~50 and 150 nm Ra. A possible approach to remove the error we observe for roughness measurement is to apply the recent techniques that combine phase and coherence information, described earlier for step-height measurement. 7,16 However, those would likely only reduce the distortion shown in our model in Section 4. The actual discrepancies seem significantly larger than this and likely arise from more complicated error sources such as the scattering effect, slope of the surface feature, NA effect, and nonlinearity.