A Survey on Dynamic Analysis of the Costas Loop

This survey is devoted to the dynamic analysis of the Costas loop. In particular the acquisition process is analyzed in great detail. Acquision is most conventiently described by a number of frequency and time parameters such as lock-in range, lock-in time, pull-in range, pull-in time, and hold-in range. While for the classical PLL equations for all these parameters have been derived (many of them are approximations, some even crude approximations), this has not yet been carried out for the Costas loop. It is the aim of this analysis to close this gap. The paper starts with an overview on mathematical and physical models (exact and simplified) of the different variants of the Costas loop, cf. Section~1. In Sections 2--5 equations for the above mentioned key parameters are derived. Finally, the hold-in range of the Costas loop for the case where a lead-lag filter is used for the loop filter is analyzed, cf. Appendix.

Dynamic behavior of the PLL and the Costas loop has been described extensively in the literature [3,6,7,[13][14][15][16][17][18][19][20][21][22][23], and a number of key parameters has been defined that describe its lock-in and lock-out characteristics. When the PLL is initially out of lock, two different types of acquisition processes can occur, either the so-called lock-in process or the so-called pull-in process. The first of those is a fast process, i.e. the acquisition takes place within at most one beat note of the difference between reference frequency ω 1 and initial VCO (Voltage Controlled Oscillator) frequency ω 2 , cf. Figure 1 for signal denotations 3 . The frequency difference for which such a fast acquisition process takes place corresponds to the lock-in range ∆ω L , and the duration of the locking process is called lock time T L . When the difference between reference and VCO frequency is larger than the lock-in range but less than the pull-in range ∆ω P , a slow acquisition process occurs. The time required to get acquisition is called pull-in time T P . In case of the PLL all these acquisition parameters can be approximated by characteristic parameters of the PLL, i.e. from natural frequency ω n and damping factor ζ.
sin(ω 1 t) cos(ω 2 t) Loop filter Because the systems considered are highly non linear, exact computation of such parameters is very difficult or even impossible. Therefore it is necessary to introduce a number of simplifications. This implies that the obtained results are only approximations, in some cases rather crude approximations.
As will be shown in the following sections there are different types of Costas loops. The very first of these loops has been described by J. Costas in 1956 [26] and was primarily used to demodulate amplitudemodulated signals with suppressed carrier (DSB-AM). The same circuit was used later for the demodulation of BPSK signals (binary phase shift keying) [10]. With the advent of QPSK (quadrature phase shift keying) this Costas loop was extended to demodulate QPSK signals as well. These two types of Costas loop operated with real signals. In case of BPSK, the input signal u 1 (t) is a sine carrier that was phase modulated by a binary signal, i.e.
where ω 1 is the (radian) carrier frequency, and m 1 (t) can have two different values, either +1 or −1, or two arbitrary equal and opposite values +c and −c, where c can be any value. In case of QPSK, two quadrature carriers are modulated by two modulating signals, i.e. u 1 (t) = m 1 (t) cos(ω 1 t) + m 2 (t) sin(ω 1 t), where m 1 and m 2 can both have two equal and opposite values +c and −c. It is obvious that in both cases the input signal is a real quantity. In the following these two types of Costas loop will be referred to as "conventional Costas loops".
Much later, Costas loops have been developed that operate not on real signals, but on pre-envelope signals [27]. These types of Costas loops will be referred to as "modified Costas loop" in the following sections. The block diagram shown in Figure 2 explains how the pre-envelope signal is obtained. The real input signal u 1 (t) is applied to the input of a Hilbert transformer [2], [5]. The output of the Hilbert transformerû 1 (t) is considered to be the imaginary part of the pre-envelope signal, i.e the pre-envelope signal is obtained from u + 1 (t) = u 1 (t) + jû 1 (t).
The Costas loops operating with pre-envelope signals will be referred to as "modified Costas loops", cf. Because there are different types of Costas loops the acquisition parameters must be derived separately for each of these types. This will be performed in the following sections. In order to see how good or bad the obtained approximations, we will develop Simulink models for different types of Costas loops and compare the results of the simulation with those predicted by theory.

BPSK Costas loop
The operation of the Costas loop is considered first in the locked state with zero phase difference (see By (1) the input signal u 1 (t) is the product of a transferred binary data and the harmonic carrier sin(ωt) with a high frequency ω. Since the Costas loop is considered to be locked, the VCO orthogonal output signals are synchronized with the carrier (i.e. there is no phase difference between these signals). The input signal is multiplied (multiplier block (⊗)) by the corresponding VCO signal on the upper branch and by the VCO signal, shifted by 90 • , on the lower branch. Therefore on the multipliers' outputs one has I 1 (t) = m 1 (t) − m 1 (t) cos(2ωt), Q 1 (t) = m 1 (t) sin(2ωt).
Consider the low-pass filters (LPF) operation. Assumption 1. Signals components, whose frequency is about twice the carrier frequency, do not affect the synchronization of the loop (since they are suppressed by the low-pass filters).
Assumption 2. Initial states of the low-pass filters do not affect the synchronization of the loop (since for the properly designed filters, the impact of filter's initial state on its output decays exponentially with time).
Assumption 3. The data signal m 1 (t) does not affect the synchronization of the loop.
Assumptions 1,2, and 3 together lead to the concept of so-called ideal low-pass filter, which completely eliminates all frequencies above the cutoff frequency (Assumption 1) while passing those below unchanged (Assumptions 2,3). In the classic engineering theory of the Costas loop it is assumed that the low-pass filters LPF are ideal low-pass filters 4 . Figure 3 the loop is in lock, i.e. the transient process is over and the synchronization is achieved, by Assumptions 1,2, and 3 for the outputs I 2 (t) and Q 2 (t) of the low-pass filters LPF one has 4 Note that Assmptions 1-3 may not be valid and require rigorous justification [28,29] I 2 (t) = m 1 (t), Q 2 (t) = 0. Thus, the upper branch works as a demodulator and the lower branch works as a phase-locked loop.

Since in
Since after a transient process there is no phase difference, a control signal at the input of VCO, which is used for VCO frequency adjustment to the frequency of input carrier signal, has to be zero: u d (t) = 0. In the general case when the carrier frequency ω and a free-running frequency ω f ree of the VCO are different, after a transient processes the control signal at the input of VCO has to be non-zero constant: u d (t) = const, and a constant phase difference θ e may remain.
By Assumptions 2 and 3 the low-pass filters outputs can be approximated as Since m 2 1 (t) ≡ 1, the input of the loop filter (LF) is Such an approximation is called a phase detector characteristic of the Costas loop.
Since an ideal low-pass filter is hardly realized, its use in the mathematical analysis requires additional justification. Thus, the impact of the low-pass filters on the lock acquisition process must be studied rigorously.
The relation between the input u d (t) and the output u f (t) of the loop filter has the forṁ where A is a constant matrix, the vector x(t) is the loop filter state, b, c are constant vectors, h is a number.
The filter transfer function has the form: The control signal u f (t) is used to adjust the VCO frequency to the frequency of the input carrier signal Here ω f ree is the free-running frequency of the VCO and K 0 is the VCO gain. The solution of (6) with initial data x(0) (the loop filter output for the initial state x(0)) is as follows where γ(t − τ ) = c * e A(t−τ ) b + h is the impulse response of the loop filter and α 0 (t, x(0)) = c * e At x(0) is the zero input response of the loop filter, i.e. when the input of the loop filter is zero.
Assumption 4 (analog of Assumption 2). Zero input response of loop filter α 0 (t, x(0)) does not affect the synchronization of the loop (one of the reasons is that α 0 (t, x(0)) is an exponentially damped function for a stable matrix A).
Consider a constant frequency of the input carrier: and introduce notation Then Assumption 4 allows one to obtain the classic mathematical model of PLL-based circuit in signal's phase space (see Figure 5): For the locked state a linear PLL model can be derived, which is shown in Figure 6. This model is useful for approximation of hold-in range.
H VCO (s) Figure 6: Linear model of Costas loop In the locked state both reference and VCO frequencies are approximately the same, hence the input of the lowpass filter is a very low frequency signal. Therefore the lowpass filter can be ignored when setting up the linear model of the Costas loop. The linear model is made up of three blocks, the phase detector PD, the loop filter LF and the VCO. In digital Costas loops the VCO is replaced by a DCO (digital controlled oscillator). This will be discussed in later sections. For these building blocks the transfer functions are now defined as follows.
Phase detector (PD). In the locked state, the phase error θ e is very small so by (5) we can write with K d called phase detector gain.
Note that the uppercase symbols are Laplace transforms of the corresponding lower case signals.
Loop filter (LF). For the loop filter we choose a PI (proportional + integral) filter whose transfer function has the from This filter type is the preferred one because it offers superior performance compared with lead-lag or lag filters.
VCO. The transfer function of the VCO is given by where K 0 is called VCO gain.
Consider another non linear model of Costas loop in Figure 7 (delay model).
2sin(θ 2 (t)) Here we use Assumtions 1-3 (initial states of filters are omitted, double-frequency terms are completely filtered by LPFs, and m 1 (t) doesn't affect synchronization) and filters LPFs are replaced by the corresponding phase-delay blocks ϕ 1 (θ e (t)) = ϕ 1 (∆ω(t)). Outputs of low-pass filters are I 2 (t) = cos(θ e (t) + ϕ 1 (θ e (t))), where Then after multiplication of I 2 (t) and Q 2 (t) we have u d (t) = I 2 (t)Q 2 (t) = 1 2 sin(2θ e (t) + 2ϕ 1 (θ e (t))) (19) and the output u f (t) of the loop filter (15) satisfies the following equationṡ Equations of Costas loop in this case arė , For LPF transfer functions phase shift is equal to ϕ 1 (θ e ) = − arctan(θ e /ω 3 ). Therefore (21) is equal to the following systeṁ where . Equation (23) is hard to analyze both numerically and analytically, however this model is still useful. In the following discussion it is used to approximate pull-in range and pull-in time. For this purpose we need to simplify delay model shown in Figure 7. Consider block diagram in Figure 8. The lowpass filters (LPF) used in both I an Q branches are assumed to be first order filters having transfer function (15). As will be demonstrated later the corner frequency of these filters must be chosen such that the data signal I is recovered with sufficient accuracy, i.e. the corner frequency ω 3 must be larger than the symbol rate. Typically it is chosen twice the symbol rate, i.e. f 3 = 2f S with f S = symbol rate and f 3 = ω 3 /2π. The output signal I 1 of the multiplier in the I branch consists of two terms, one having the sum frequency ω 1 + ω 2 and one having the difference frequency ω 1 − ω 2 . Because the sum frequency term will be suppressed by the lowpass filter, only the difference term is considered. The same holds true for signal Q 1 in the Q branch. It will show up that the range of difference frequencies is markedly below the corner frequency ω 3 of the lowpass filter. Hence the filter gain will be nearly 1 for the frequencies of interest. As will also be shown later the phase at frequency ∆ω = ω 1 − ω 2 cannot be neglected. The lowpass filter is therefore represented as a delay block whose transfer function has the value exp(jϕ 1 ), where ϕ 1 is the phase at frequency ∆ω. The delayed signals I 2 and Q 2 are now multiplied by the product block at the right in the block diagram. Consequently the output signal u d (t) of this block will have a frequency of 2∆ω.
This signal is now applied to the input of the loop filter LF. Its transfer function has been defined in (15).
The corner frequency of this filter is ω C = 1/τ 2 . Because the phase of the loop filter cannot be neglected, it is represented as a delay block characterized by where ϕ 2 is the phase of the loop filter at frequency 2∆ω.
The analysis of dynamic behavior becomes easier when the order of some blocks in Figure 8 is reversed (see Figure 9), i.e. when we put the multiplying block before the lowpass filter. Because the frequency of signal u d (t) in Figure 8 is twice the frequency of the signals I 2 and Q 2 , the phase shift created by the lowpass filter at frequency 2∆ω is now twice the phase shift at frequency ∆ω.
The LPF is therefore represented here by a delay block having transfer function exp(2jϕ 1 ).
We can simplify the block diagram even more by concatenating the lowpass filter and loop filter blocks.
The resulting block delays the phase by ϕ tot = 2ϕ 1 + ϕ 2 . This is shown in Figure 10. The output signal u f (t) of this delay block now modulates the frequency generated by the VCO. To compute pull-in time we need to consider Costas loop model in Figure 5 with averaged signals of phase detector output u d and filter output u f (see Figure 11). Figure  The model is built from three blocks. The first of these is labeled "phase-frequency detector". We have seen that in the locked state the output of the phase detector depends on the phase error θ e . In the unlocked state, however, the average phase detector output signal u d is a function of frequency difference as will be shown in next section (Eq. (68)), hence it is justified to call that block "phase-frequency detector". As we will recognize the pull-in process is a slow one, i.e. its frequency spectrum contains low frequencies only that are below the corner frequency ω C of the loop filter, cf. Eq. (15). The loop filter can therefore be modeled as a simple integrator with transfer function

LF PD
Therefore The frequency ω 2 of the VCO output signal is defined as where ω f ree is the free running frequency and K 0 is the VCO gain. Now we define the instantaneous frequency difference ∆ω as Substituting (11) and (28) into (27) finally yields
On the lower branch the output signal of VCO is multiplied by the input signal: Here from an engineering point of view, the high-frequency terms cos(2ωt) and sin(2ωt) are removed by ideal low-pass filters LPFs (see Assumption 1 in previews section). In this case, the signals I 2 (t) and Q 2 (t) on the upper and lower branches can be approximated as Apart from considered case there are two possible cases: 1) the frequencies are different or 2) the frequencies are the same but there is a constant phase difference. Consider Costas loop before synchronization (see Figure 13) 2sin(θ 2 (t)) LPF LPF u d (t) ≈ φ(θ e (t)) sgn sgn I 1 (t) 2cos(θ 2 (t)) Loop filter + -Σ Figure 13: QPSK Costas loop is out of lock, there is non zero phase difference.
in the case when the phase of the input carrier θ 1 (t) and the phase of VCO θ 2 (t) are different: In this case, using Assumption 1, the signals I 2 (t) and Q 2 (t) on the upper and lower branches can be approximated as I 2 (t) ≈ m 1 (t) cos(θ e (t)) + m 2 (t) sin(θ e (t)), Q 2 (t) ≈ −m 1 (t) sin(θ e (t)) + m 2 (t) cos(θ e (t)).
After the filtration, both signals, I 1 (t) and Q 1 (t), pass through the limiters (sgn blocks). Then the outputs of the limiters sign I 2 (t) and sign Q 2 (t) are multiplied with Q 2 (t) and I 2 (t), respectively. By Assumption 2 and corresponding formula (32), the difference of these signals can be approximated as with m = |m 1 | = |m 2 |. Here ϕ(θ e (t)) is a piecewise-smooth function 5 shown in Figure 14. The resulting signal ϕ(t), after the filtration by the loop filter, forms the control signal u f (t) for the

VCO.
To derive mathematical model in the signal space describing physical model of QPSK Costas loop one takes into account (6) and (8): However equations (35) are nonlinear and non autonomous with discontinuous right-hand side, which are extremely hard to investigate. Therefore, the study of (35) is outside of the scope of this work.
To derive linear model, we consider (34) and the corresponding Figure 14. The curve looks like a "chopped" sine wave. The Costas loop can get locked at four different values of θ e , i.e. with θ e = 0, π/2, π, or 3π/2. To simplify the following analysis, we can define the phase error to be zero wherever the loop gets locked. Moreover, in the locked state the phase error is small, so we can write i.e. the output signal of the adder block at the right of Figure 13 is considered to be the phase detector output signal u d . The phase detector gain is then It is easily seen that the linear model for the locked state is identical with that of the Costas loop for BPSK, cf. Figure 6. Because only small frequency differences are considered here, the lowpass filters can be discarded. The transfer functions of the loop filter and of the VCO are assumed to be the same as in the case of the Costas loop for BPSK, hence these are given by Eqs. (15) and (16).

Similar to BPSK Costas loop, it is reasonable to consider delay model of QPSK Costas loop (see Fig-
2cos(θ 2 (t)) Then u d (t) can be approximated as Consider the loop filter transfer function (15). Equations of delay model of QPSK Costas loop in this case , The non linear model of the Costas loop for QPSK is developed on the basis of the non linear model we derived for the Costas loop for BPSK, cf. Figure 10. Here again the order of lowpass filters and the blocks shown at the right of Figure 12 is reversed. This results in the model shown in Figure 16a. Figure  Q 1 are passed through lowpass filters. As in the case of the Costas loop for BPSK we assume here again that the difference frequency ∆ω is well below the corner frequency ω 3 of the lowpass filters, hence the gain of the lowpass filters is nearly 1 at ω = ∆ω. Because the phase shift must not be neglected, we represent the lowpass filter by a delay, i.e. its frequency response at ω = ∆ω is where ϕ 1 is the phase of the lowpass filter. Due to the arithmetic operations in block "B" (cf. Figure 16) the frequency of the u d is quadrupled, which implies that the phase shift at frequency 4∆ω becomes 4ϕ 1 .
The frequency response of the loop filter at ω = 4∆ω is given by The block diagram of the modified Costas loop for BPSK is shown in Figure 17. The input signal is given by where θ 1 is initial phase. The input signal is first converted into a pre-envelope signal, as explained in section 1. The output signal of the Hilbert transformer iŝ Note that because the largest frequency of the spectrum of the data signal m 1 (t) is much lower than the carrier frequency ω 1 , the Hilbert transform of the product H[m 1 (t) cos(ω 1 t+θ 1 )] equals m 1 (t)H[cos(ω 1 t+θ 1 )] [5]. The pre-envelope signal is obtained now from The exponential in Eqn. (41) is referred to as a "complex carrier". In Figure 17 complex signals are shown as double lines. The solid line represents the real part, the dotted line represents the imaginary part.
To demodulate the BPSK signal, the pre-envelope signal is now multiplied with the output signal of the VCO, which is here a complex carrier as well. The complex output signal of the VCO is defined as In the locked state of the Costas loop both frequencies ω 1 and ω 2 are equal, and we also have θ 1 ≈ θ 2 .
Hence the output signal of the multiplier M 1 is i.e. the output of the multiplier is the demodulated data signal m 1 (t). To derive the linear model of this Costas loop, it is assumed that ω 1 = ω 2 and θ 1 = θ 2 . The output signal of multiplier M 1 then becomes This is a phasor having magnitude |m 1 (t)| and phase θ 1 − θ 2 , as shown in Figure 18. Two quantities are determined from the phase of phasor u m (t), i.e. the demodulated data signal I and the phase error θ e . The data signal is defined as i.e. when the phasor lies in quadrants I or IV, the data signal is considered to be +1, and when the phasor is in quadrants II or III, the data signal is considered to be -1. This means that I can be either a phasor with phase 0 or a phasor with phase π.
These two phasors are plotted as thick lines in Figure 18.
The phase error θ e is now given by the difference of the phases of phasor u m (t) and phasor I, as shown in figure 18, i.e. θ e is determined from The product u m (t)I is computed by multiplier M 2 in Figure 18. The block labeled "Complex → mag, phase" is used to convert the complex signal delivered by M 2 into magnitude and phase. The magnitude is not used in this case, but only the phase. It follows from Eqn. (46) that the phase output of this block is the phase error θ e , hence the blocks M 1 , M 2 , sgn, and Complex → mag, phase represent a phase detector with gain K d = 1. The phase output of block Complex → mag, phase is therefore labeled u d . Figure 6 shows the complete linear model of the modified Costas loop for BPSK. The transfer functions of the loop filter and VCO have been defined in Eqs. (15) and (16). Note that with this type of Costas loop there is no additional lowpass filter, because the multiplication of the two complex carriers (cf. Eqn. (43)) does not create the unwanted double frequency component as found with the conventional Costas loops. Figure 19 shows the block diagram of the modified Costas loop for QPSK. The reference signal u 1 (t) is defined by

Modified Costas loop for QPSK
The Hilbert transformed signal is then given bŷ and the pre-envelope signal then becomes This can be rewritten as Herein the term (m 1 (t) + jm 2 (t)) is complex envelope, and the term exp(jω 1 t + θ 1 ) is complex carrier. The VCO generates another complex carrier given by (42). The multiplier M 1 creates signal u m (t) that is given by When the loop has acquired lock, ω 1 = ω 2 , and θ 1 ≈ θ 2 , so we have I and Q are taken from the output of sgn blocks, cf. Figure 19. The phase error is obtained from where I − jQ is the conjugate of the complex envelope. Multiplier M 2 delivers the product u m (t)(I − jQ), and the block "Complex → mag, phase" is used to compute the phase of that complex quantity. Note that the magnitude is not required. The blocks M 1 , sgn, Inverter, M 2 , and Complex → mag, phase form a phase detector having gain K d = 1. The phase output of block Complex → mag, phase is therefore labeled u d .

Definitions of hold-in range, lock-in range, pull-in range.
In the classic books on phase-locked loops [30][31][32] such concepts as hold-in pull-in lock-in and other frequency ranges for which PLL can achieve lock were introduced. Usually in engineering literature nonrigorous definitions are given for these concepts. In the following we introduce definitions, based on rigorous discussion in [33,34].
Definition of hold-in range. The largest interval [0, ∆ω h ) of frequency deviations |∆ω 0 |, such that the loop re-achieves locked state after small perturbations of the filters' state, the phases and frequencies of VCO, and the input signals, is called a hold-in range. This effect is also called steady-state stability. In addition, for a frequency deviation within the hold-in range, the loop in a locked state tracks small changes in input frequency, i.e. achieves a new locked state (tracking process) [33,34].
Assume that the loop power supply is initially switched off and then at t = 0 the power is switched on, and assume that the initial frequency difference is sufficiently large. The loop may not lock within one beat note, but the VCO frequency will be slowly tuned toward the reference frequency (acquisition process). This effect is also called a transient stability. The pull-in range is used to name such frequency deviations that make the acquisition process possible.
Definition of pull-in range. The largest interval [0, ∆ω P ) of frequency deviations |∆ω 0 |, such that the loop achieves locked state for any initial states (filters and initial phase of VCO), is called a pull-in range [33,34]. The largest frequency deviation ∆ω P is called a pull-in frequency [33,34].
Definition of lock-in range. Lock-in range is a largest interval of frequency deviations |∆ω 0 | ∈ [0, ∆ω L ) inside pull-in range, such that after an abrupt change of ω 1 within a lock-in range the PLL reacquires lock without cycle slipping, if it is not interrupted. Here ∆ω L is called a lock-in frequency [33,34] 6 .
which is in agreement with the classical consideration (see, e.g. [ Figure 6). By (13), (14), and (15)   ω C , which is defined by ω C = 1/τ 2 , and gain parameters K d and K 0 . At lower frequencies the magnitude rolls off with a slope of -40 dB/decade. At frequency ω C the zero of the loop filter causes the magnitude to change its slope to -20 dB/decade. To get a stable system, the magnitude curve should cut the 0 dB line with a slope that is markedly less than -40 dB/decade. Setting the parameters such that the gain is just 0 dB at frequency ω C provides a phase margin of 45 degrees, which assures stability [2]. From the open loop transfer function we now can calculate the closed loop transfer function defined by After some mathematical manipulations we get It is customary to represent this transfer function in normalized form, i.e.
with the substitutions where ω n is called natural frequency and ζ is called damping factor. The linear model enables us to derive simple approximations for lock-in range ∆ω L and lock time T L .
For the following analysis we assume that the loop is initially out of lock. The frequency of the input signal ( Figure 4) is ω 1 , and the frequency of the VCO is ω 2 . The multiplier in the I branch therefore generates an output signal consisting of a sum frequency term ω 1 + ω 2 and a difference frequency term ω 1 − ω 2 . The sum frequency term is removed by the lowpass filter, and the frequency of the difference term is assumed to be much below the corner frequency ω 3 of the lowpass filter, hence the action of this filter can be neglected for this case. Under this condition the phase detector output signal u d (t) will have the form (cf. Eqs. (19) and (13)) with ∆ω = ω 1 −ω 2 . u d (t) is plotted in Figure 2.2, left trace. This signal passes through the loop filter. In most cases the corner frequency ω C = 1/τ 2 is much lower than the lock-in range, hence we can approximate its transfer function by Now the lock process is a damped oscillation whose frequency is the natural frequency. Because the loop is assumed to lock within at most one cycle of that frequency, the lock time can be approximated by the period of the natural frequency, i.e. we have 2.2. Pull-in range ∆ω P and pull-in time T P We have seen that all signals found in this block diagram are sine functions, i.e. all of them seem to have zero average, hence do not show any dc component. This would lead to the (erroneous) conclusion that a pull-in process would not be possible. In reality it will be recognized that some of the signals become asymmetrical, i.e. the duration of the positive half wave is different from the duration of the negative one.
This creates a non zero dc component, and under suitable conditions acquisition can be obtained. We are therefore going to analyze the characteristics of the signals in Figure 10.
All considered signals are plotted in Figure 2.3. For signals I 1 and Q 1 we obtain The sum frequency terms are discarded because they are removed by the lowpass filter. The signal u d (t) is the product of I 1 and Q 1 and is given by (59). For small arguments 2∆ωt this can be written as where θ e = ∆ωt. Because the phase detector gain is defined by we have K d = m 2 1 . Next the loop filter output signal u f (t) is plotted. Its amplitude is K H m 2 1 /2, and its phase is delayed by ϕ tot = 2ϕ 1 + ϕ 2 . This signal modulates the frequency of the VCO as shown in the bottom trace of increased, which means that the average difference frequency ∆ω(t) is lowered. Consequently the duration of the positive half wave becomes larger than half of a full cycle. During the negative half cycle (duration T 2 ), however, the average value of VCO output frequency ω 2 (t) is decreased, which means that the average difference frequency ∆ω(t) is increased. Consequently the duration of the negative half wave becomes less than half of a full cycle. Next we are going to calculate the average frequency difference in both half cycles.
The average frequency difference during half cycle T 1 is denoted ∆ω d+ , the average frequency difference during half cycle T 2 is denoted ∆ω d− . We get For the durations T 1 and T 2 we obtain after some manipulations Now the average value u d can be calculated from The average signal u d is seen to be inversely proportional to the frequency difference ∆ω. Because u d is positive, the instantaneous frequency ω 2 (t) is pulled in positive direction, i.e. versus ω 1 , which means that a pull-in process will take place. Next we are going to analyze the dependence of u d on phase ϕ tot . Let us consider now the case for ϕ tot = −π, cf. Figure 2.5. We observe that in interval T 1 the instantaneous frequency ω 2 (t) is pulled in negative direction, hence the average difference frequency ∆ω d+ becomes larger. Consequently interval T 1 becomes shorter. In interval T 2 , however, the reverse is true. Here the instantaneous frequency T 1 the pulled in positive direction, hence the average ∆ω d− is reduced, and interval T 2 becomes longer. The average u d is now equal and opposite to the value of u d for ϕ tot = 0. Because it is negative under this condition, a pull-in process cannot take place, because the frequency of the VCO is "pulled away" in the wrong direction.
Last we consider the case ϕ tot = −π/2, cf. It is easy to demonstrate that u d varies with cos(ϕ tot ), hence we have Eq. (68) tells us that the pull-in range is finite. The pull-in range can be found as the frequency difference for which phase ϕ tot = −π/2. An equation for the pull-in range will be derived in section 2.2. We also will have to find an equation for the pull-in time. The model shown in Figure 11 will enable us to obtain a differential equation for the average frequency difference ∆ω as a function of time. This will be demonstrated in Section 2.2.
To solve this equation for ∆ω P we use the addition formula for the tangent function tg(2α) = 2tgα 1 − tg 2 α and can replace 2arctg(∆ω P /ω 3 ) by arctg . Eq. (70) can now be rewritten as arctg When the arctg expressions on both sides of the equation are equal, their arguments must also be identical, which leads to Hence we get for the pull-in range Last an equation for the pull-in time T P will be derived. Eqs. (68), (26), and (29) describe the behavior of the three building blocks in Figure 11 and enable us to compute the three variables u d , u f , and ∆ω. We only need to know the instantaneous ∆ω vs. time, hence we eliminate u d and u f from Eqs. (26) and (29) and obtain the differential equation This differential equation is non linear, but the variables ∆ω and t can be separated, which leads to an explicit solution. Putting all terms containing ∆ω to the left side and performing an integration we get The limits of integration are ∆ω 0 and ∆ω L on the left side, because the pull-in process starts with an initial frequency offset ∆ω = ∆ω 0 and ends when ∆ω reaches the value ∆ω L , which is the lock-in range. Following that instant a lock-in process will start. The integration limits on the right side are 0 and T P , respectively, which means that the pull-in process has duration T P , and after that interval (fast) lock-in process starts.
Finding an explicit solution for the integral seems difficult if not impossible, but the cos term can be drastically simplified. When we plot cos(ϕ tot ) vs. ∆ω we observe that within the range ∆ω L < ∆ω < ∆ω 0 the term cos(ϕ tot ) is an almost perfect straight line. Hence we can replace cos(ϕ tot ) by Inserting that substitution into Eq. (73) yields a rational function of ∆ω on the left side, which is easily integrated. After some mathematical procedures we obtain for the pull-in time T P Making use of Eqs. (58) and (60) we have Using these substitutions Eq. (74) can be rewritten as This equation is valid for initial frequency offsets in the range ∆ω L < ∆ω 0 < ∆ω P . For lower frequency offsets, a fast pull-in process will occur, and Eq. (62) should be used.

Numerical example 1: Designing an analog Costas loop for BPSK
An analog Costas loop for BPSK shall be designed in this section. It is assumed that a binary signal is modulated onto a carrier. The carrier frequency is set to 400 kHz, i.e. the Costas loop will operate at a center frequency ω 0 = 2π 400'000 = 2'512'000 rad s −1 . The symbol rate is assumed to be f S = 100 000 symbols/s. Now the parameters of the loop (such as time constants τ 1 and τ 2 , corner frequencies ω C and ω 3 , and gain parameters such as K 0 , K d ) must be determined. (Note that these parameters have been defined in Eqs. (14), (15), (16) and (62)).
The modulation amplitude is set m 1 = 1. According to Eq. (13) the phase detector gain is then K d = 1.
It has proven advantageous to determine the remaining parameters by using the open loop transfer function G OL (s) of the loop [2]. This is given by According to Eq. (76) we can set Because 2 parameters are still undetermined, one of those can be chosen arbitrarily, hence we set τ 1 = 20µs.
The design of the Costas loop is completed now, and we can compute the most important loop parameters. Next we want to compute the pull-in range. Eq. (71) yields ∆ω P = 1 086 440 rads −1 (∆f P = 173 kHz).

Numerical example 2: Designing a digital Costas loop for BPSK
To convert the analog loop into a digital one, we first must define a suitable sampling frequency f samp (or sampling interval T = 1/f samp ). To satisfy the Nyquist theorem, the sampling frequency must be higher than twice the highest frequency that exists in the loop. In our case the highest frequency is found at the output of the multipliers in the I and Q branches (cf. Figure 4).
Now the bilinear z transform has the property that the analog frequency range from 0 . . . ∞ is compressed to the digital frequency range from 0 . . . f samp /2. To avoid undesired "shrinking" of the corner frequencies (ω C and ω 3 ), these must be "prewarped" accordingly, i.e. we must set where ω C,p and ω 3,p are the prewarped corner frequencies. Now we can apply the bilinear z transform to the transfer functions of the lowpass filters (cf. Eq. (22)) and of the loop filter (cf. Eq. (15)) and get Because the VCO is a simple integrator, we can apply the discrete z transform of an integrator, i.e.
The digital Costas loop is ready now for implementation. A Simulink model will be presented in section 2.5.

Simulating the digital Costas loop for BPSK
A Simulink model of a Costas loop for BPSK is shown in Figure 2.8.  We note that the predicted and simulated parameters are in good agreement.

Remarks on simulation of BPSK Costas loop
Note that a numerical simulation of various models of the same circuit can lead to essentially different results if the corresponding mathematical assumptions, used for the models construction, are not satisfied.
Also the errors caused by numerical integration (e.g. in MATLAB and SPICE) can lead to unreliable results [28,29,50,51].
The following examples demonstrate some limitations of numerical approach on simple models.
Example 1 (double frequency and averaging). In Figure 2.9 it is shown that Assumption 1 may not be valid: mathematical model in signal's phase space (see Figure 1 -black color) and physical model (see Figure 4 and system (9) -red color) after transient processes have different phases in the locked states.
Here VCO free-running frequency ω f ree = 2 · π · 400000 − 600000; initial states of filters are all zero:  Consider now the corresponding phase portrait (see Figure 2.11). Here the red trajectory tends to a stable equilibrium (red dot). Lower and higher black trajectories are stable and unstable limit cycles, respectively. The blue trajectory tends to a stable periodic trajectory (lower black periodic curve) and in this case the model does not acquire lock. All trajectories between black trajectories (see green trajectory) tend to the stable lower black trajectory.
If the gap between stable and unstable trajectories (black lines) is smaller than the discretization step, the numerical procedure may slip through the stable trajectory (blue trajectory may step over the black and green lines and begins to be attracted to the red dot). In other words, the simulation may show that the Costas loop acquires lock although in reality it does not. The considered case corresponds to the coexisting attractors (one of which is a hidden oscillation) and the bifurcation of birth of a semistable trajectory [52].
Note, that only trajectories (red) above the unstable limit cycle is attracted to the equilibrium. Hence ∆ω = 89.45 does not belong to the pull-in range.
θ e (t) This signal modulates the output frequency of the VCO, and the modulation amplitude is given Figure 3.2. It is easily seen that the loop spontaneously locks when the peak of the ω 2 (t) waveform touches the ω 1 line, hence we have Because the transient response of the loop is a damped oscillation whose frequency is ω n , the loop will lock in at most one cycle of ω n , and we get for the lock time

Pull-in range and pull-in time for QPSK
Consider the simplified non linear model of QPSK Costas loop, cf section 1.1.2. Let us define the total phase by ϕ tot = 4ϕ 1 + ϕ 2 . Next we are computing the average phase detector output signal u d as a function of frequency difference and phase tot. First we calculate u d for the special case ϕ tot = 0. As shown in the right trace in Figure 3.3 during interval T 1 the average frequency ω 2 is increased, hence the average difference ∆ω becomes smaller. During next half cycle T 2 the reverse is true: the average difference ∆ω becomes greater, hence for ϕ tot = 0 T 1 is longer than T 2 . The modulating signal is therefore asymmetric, and because also u d (t) (left trace) is asymmetrical, its average u d becomes non zero and positive. This asymmetry has been shown exaggerated in Figure 3.3.
Using the same mathematical procedure as for BPSK Costas loop, the average u d signal is given by As in case of the Costas loop for BPSK, here again Eq. (88) tells us that the pull-in range is finite. The pull-in range is the frequency difference for which phase ϕ tot = −π/2. An equation for the pull-in range will be derived here. We also will have to find an equation for the pull-in time. To derive the pull-in process, we will use the same non linear model as used for the Costas loop for BPSK, cf. Figure 11. The transfer functions for the loop filter and for the VCO have been given in Eqs. (26) and (29), respectively.
The pull-in range can be computed using Eq. (88). Lock can only be obtained when the total phase shift tot is not more negative than −π/2. This leads to an equation of the form According to Eqs. (15) and (22) ϕ 1 and ϕ 2 are given by with ω C = 1/τ 2 . Hence the pull-in range ∆ω P can be computed from the transcendental equation Using the addition theorem of the tangent function the term 4arctg(∆ω p /ω 3 ) can be replaced by arctg Solving for ∆ω P yields Last an equation for the pull-in time T P will be derived. Based on the non linear model shown in Figure 11 and in Eqs. (26), (29), and (88) we can create a differential equation for the instantaneous difference frequency ∆ω as a function of time. For this type of Costas loop the differential equation has the Also here the cos term can be replaced by cos ϕ tot ≈= 1 − ∆ω ∆ω P and, using similar procedures as in previews section, we get for the pull-in time which again is valid for initial frequency offsets in the range ∆ω L < ∆ω 0 < ∆ω P . For lower frequency offsets, a fast pull-in process will occur, and Eq. (87) should be used.

Numerical example: Designing a digital Costas loop for QPSK
A digital Costas loop for QPSK shall be designed in this section. It is assumed that two binary signals (I and Q) are modulated onto a quadrature carrier (cosine and sine carrier). The carrier frequency is set to 400 kHz, i.e. the Costas loop will operate at a center frequency ω 0 = 2π 400'000 = 2'512'000 rad s −1 .
The symbol rate is assumed to be f S = 100'000 symbols/s. Now the parameters of the loop (such as time constants τ 1 and τ 2 , corner frequencies ω C and ω 3 , and gain parameters such as K 0 , K d ) must be determined.
(Note that these parameters have been defined in Eqs. (14), (15), (16) and (62)). It is possible to use the same parameters as for digital BPSK, i.e.

Simulating the digital Costas loop for QPSK
A Simulink model of a Costas loop for QPSK is shown in Figure 3.4.  Table 3-1.
At higher frequency offsets the results of the simulation are in good agreement with the predicted ones.
The pull-in time for an initial frequency offset of 40 kHz is too low, however, but it should be noted that the lock time T L is about 25 µs, and the the total pull-in time cannot be less than the lock time.
For the following analysis we assume that the loop is initially out of lock. The frequency of the reference signal ( Fig. 17) is ω 1 , and the frequency of the VCO is ω 2 . The output signal of multiplier M 1 is then a phasor rotating with angular velocity ∆ω = ω 1 − ω 2 . Consequently the phase output of block "Complex → mag, phase" is a sawtooth signal having amplitude (π/2) K d and fundamental frequency 2∆ω, as shown in the left trace of Figure 4.1. Because 2∆ω is usually much higher than the corner frequency ω C of the loop filter, the transfer function of the loop filter at higher frequencies can be approximated again by The output signal u f of the loop filter is a sawtooth signal as well and has amplitude (π/2) K d K H , as shown in the middle trace of the figure 4.1. This signal modulates the frequency ω 2 generated by the VCO.
The modulation amplitude is given by (π/2) K d K H K 0 , cf. right trace. The Costas loop spontaneously acquires lock when the peak of the ω 2 waveform touches the ω 1 line, hence we have Making use of the substitutions Eqn. (95) this can be rewritten as Because the lock process is a damped oscillation having frequency ω n the lock time can be approximated by one cycle of this oscillation, i.e. of the negative. The middle trace shows the output signal of the loop filter, and the right trace shows the modulation of the VCO output frequency ω 2 . From this waveform it is seen that during T 1 the average frequency difference ∆ω becomes smaller, but during interval T 2 it becomes larger. Consequently the duration of T 1 is longer than the duration of T 2 , and the average of signal u d is non zero and positive. Using the same mathematical procedure as in previews sections, the average u d can be computed from MHz, then the maximum pull-in range ∆f P is 10 MHz, i. e. ∆ω P = 6.28 · 10 6 rad/s.
As seen in the last section, the pull-in range of this type of Costas loop can be arbitrarily large. Using the same model as for BPSK Costas loop (see Figure 11), we can derive an equation for the pull-in time:

Designing a digital modified Costas loop for BPSK
The following design is based on the method we already used in section 2.3. It is assumed that a binary signal I is modulated onto a carrier. The carrier frequency is set to 400 kHz, i.e. the Costas loop will operate at a center frequency ω 0 = 2π 400'000 = 2'512'000 rad s −1 . The symbol rate is assumed to be f S = 100 000 symbols/s. Now the parameters of the loop (such as time constants τ 1 and τ 2 , corner frequency ω C , and gain parameters such as K 0 , K d ) must be determined. (Note that these parameters have been defined in Eqs. (14), (15), (16), and (62)).
It has been shown in section 4.1 that for this type of Costas loop K d = 1. The modulation amplitudes      It is easily seen that here u d is given by

Simulating the modified digital Costas loop for BPSK
The blocks shown in

A note on the design of Hilbert transformers
Hilbert transformers as used in the system of Figure 17 are implemented in most cases by digital filters.
In this application the maximum frequency in the spectrum of the modulating signal m 1 (t) is much lower than the carrier frequency f 1 . Under this condition the Hilbert transformer can be replaced by a simple delay block. All we have to do is to shift the input signal u 1 (t) by one quarter of a period of the carrier.
When the sampling frequency f S is n times the carrier frequency f 1 , we would shift the input signal by n/4 samples. This implies that n must be an integer multiple of 4.  ω C , which is defined by ω C = 1/τ 2 , and gain parameters K d and K 0 . At lower frequencies the magnitude rolls off with a slope of -40 dB/decade. At frequency ω C the zero of the loop filter causes the magnitude to change its slope to -20 dB/decade. To get a stable system, the magnitude curve should cut the 0 dB line with a slope that is markedly less than -40 dB/decade. Setting the parameters such that the gain is just 0 dB at frequency ω C provides a phase margin of 45 degrees, which assures stability [2]. From the open loop transfer function we now can calculate the closed loop transfer function defined by After some mathematical manipulations we get .
It is customary to represent this transfer function in normalized form, i.e.
G CS (s) = 2sζω n + ω 2 n s 2 + 2sζω n + ω 2 n with the substitutions where ω n is called natural frequency and ζ is called damping factor. The linear model enables us to derive simple expressions for lock-in range ∆ω L and lock time T L .
For the following analysis we assume that the loop is initially out of lock. The frequency of the reference signal ( Figure 19) is ω 1 ,and the frequency of the VCO is ω 2 . The output signal of multiplier M 1 is then a phasor rotating with angular velocity ∆ω = ω 1 − ω 2 . Consequently the phase output of block "Complex → mag, phase is a sawtooth signal having amplitude (π/4) K d and fundamental frequency 4 ∆ω, as shown in the left trace of Figure 5.2. Because 4 ∆ω is usually much higher than the corner frequency ω C of the loop filter, the transfer function of the loop filter at higher frequencies can be approximated again by The output signal u f of the loop filter is a sawtooth signal as well and has amplitude (π/4) K d K H , as shown in the middle trace of the figure. This signal modulates the frequency ω 2 generated by the VCO. The modulation amplitude is given by (π/4) K d K H K 0 , cf. right trace. The Costas loop spontaneously acquires lock when the peak of the ω 2 waveform touches the ω 1 line, hence we have Making use of the substitutions Eqn. (95), this can be rewritten as ∆ω L = π 2 ζω n .
Because the lock process is a damped oscillation having frequency ω n , the lock time can be approximated by one cycle of this oscillation, i.e.

Pull-in range and pull-in time of the modified Costas loop for QPSK
Assume that the loop is not yet locked, and that the difference between reference frequency ω 1 and VCO output frequency ω 2 is ∆ω = ω 1 − ω 2 . As shown in section 5.1 (cf. also Figure 5.2) u d is a sawtooth signal having frequency 4∆ω, cf. left trace in Figure 5 As will be explained in short, this signal is asymmetrical, i.e. the duration of the positive wave T 1 is not identical with the duration T 2 of the negative. The middle trace shows the output signal of the loop filter, and the right trace shows the modulation of the VCO output frequency ω 2 . From this waveform it is seen that during T 1 the average frequency difference ∆ω becomes smaller, but during interval T 2 it becomes larger. Consequently the duration of T 1 is longer than the duration of T 2 , and the average of signal u d is non zero and positive. Using the same mathematical procedure as in sections 2. 3 and 3.3 the average u d can be computed from Because this type of Costas loop does not require an additional lowpass filter, the u d signal is not shifted in phase, and therefore there is no cos term in Eqn. (108). This implies that there is no polarity reversal in the function u d (∆ω), hence the pull-in range becomes theoretically infinite. Of course, in a real circuit the pull-in range will be limited by the frequency range of the VCO is capable to generate. When the center frequency f 0 of the loop is 10 MHz, for example, and when the VCO can create frequencies in the range from 0 . . . 20 MHz, then the maximum pull-in range ∆f P is 10 MHz, i. e. ∆ω P = 6.28 · 10 6 rad/s.
As seen in the last section, the pull-in range of this type of Costas loop can be arbitrarily large. Using non linear model (11) we can derive an equation for the pull-in range:

Designing a digital modified Costas loop for QPSK
The following design is based on the method we already used in section 4.3. It is assumed that two binary signals (I and Q) are modulated onto a quadrature carrier (cosine and sine carrier). The carrier frequency is set to 400 kHz, i.e. the Costas loop will operate at a center frequency ω 0 = 2π400 000 = 2 512 000 rad s −1 .
The symbol rate is assumed to be f S = 100 000 symbols/s. Now the parameters of the loop (such as time constants τ 1 and τ 2 , corner frequency ω C , and gain parameters such as K 0 , K d ) must be determined. (Note that these parameters have been defined in Eqs. (14), (15), (16), and (62)).
It has been shown in previews sections that for this type of   The predictions come very close to the results obtained from the simulation.

An alternative structure of the modified Costas loop for BPSK
As demonstrated in Figure 19 the phase error signal u d was obtained from the phase output of block "Complex → mag, phase". The phase of the complex input signal to this block can be obtained from the arctg function. This imposes no problem when a processor is available. This is the case in most digital implementations of the Costas loop. As an alternative a phase error signal can also be obtained directly from the imaginary part of multiplier M 2 ; this is shown in Figure 5.5. It is easily seen that here u d is given by The blocks shown in Figure 5

Acknowledgements
This work was supported by Russian Science Foundation (project 14-21-00041) and Saint-Petersburg State University

Hold-in range for lead-lag filter
One needs to be cautious using model in Figure 5 even for calculating hold-in range for BPSK Costas.
Consider an example: Costas loop with lead-lag loop filter F (s) = 1 + sτ 2 1 + sτ 1 , τ 1 > τ 2 > 0 (110) and low-pass filters LPFs In locked state phase error θ e satisfies therefore we get a bound for the hold-in range In order to find hold-in range we need to find poles of the closed-loop transfer function (roots of the characteristic polynomial) for the linearized model (small-signal model) of the system on Figure 4. Openloop transfer function is (1 + τ 2 s)K 0 K d cos(2θ eq ) + s(1 + s ω 3 )(1 + τ 1 s).
Phase error θ eq corresponds to hold-in range (see (112)) if all roots of the polynomial (115) have negative real parts (i.e. polynomial (115) is stable). Applying Routh-Hurwitz criterion to study stability of the polynomial, we get that for the following parameters polynomial (115) is stable for all |∆ω 0 | < K0K d 8 . However, if the following condition is necessary for stability cos(2θ eq ) < 2