A Fast and Stable Method for Modeling Generalized Nonlinearities in Power Electronic Circuit Simulation and Its Real-Time Implementation

Nonlinearities have been the major obstructions that limit the computational efficiency in power electronic circuit simulation for a long time. Yet there is no standard way for dealing with them. This paper presents a new method that makes the handling of nonlinearities fast and stable. In the proposed method, nonlinearities are transformed into a uniform representation—a constant resistor in parallel with a companion current source, thus making the system admittance matrix constant for fixed time-step simulation. To solve for the corresponding companion current source, nonlinearities are treated as either current or voltage sources and a diagonal time-varying matrix equation is developed. Three methods are proposed for solving the matrix equation—precomputed inversion or factorization, modified Gaussian elimination, and updating inverse using the Sherman–Morrison formula—that can fit different system sizes and applications. The proposed method is validated by two common power electronic converter topologies, both in offline and real-time simulation. Offline tests show that the proposed method achieved the same accuracy with the mature simulation software while being more than ten times faster. The same test cases are also implemented into field programmable gate arrays based real-time simulation experiments for verification.


I. INTRODUCTION
M ODERN power electronic systems need more powerful simulation tools to meet the demands of speed and accuracy as the system complexity increases accordingly.Some systems, such as the modular multilevel converter (MMC), contain hundreds of components, including both linear (capacitors and inductors) and nonlinear (semiconductor switches) components in one circuit instance.The large number of switching elements in the MMC introduces a challenge for modeling the converter on electromagnetic transient simulation programs [1].The linear components can be dealt with properly using numerical integration methods, such as backward Euler or the trapezoidal method, whereas the nonlinear components still pose the bottleneck of simulation performance.
Nonlinearities bring uncertainties into the system admittance matrix for nodal analysis and should be taken care of differently according to their characteristics [2].Basically, there are two ways to deal with nonlinearities: iterative and noniterative.The iterative methods combine the linear network equations with the nonlinear characteristic equations and try to find a solution that satisfies both of them.Some iterative algorithms, such as the Newton-Raphson method, are utilized to search for the solution.The iterative solution (if it exists) is always theoretically accurate (i.e., no approximations are made).One example of an iterative model is the zinc oxide arrester model in the EMTP [3] and the online monitoring program in [4], which uses an analytic equation to model its behavior.Other examples can be found in the works of [5]- [8].However, there are mainly two drawbacks to iterative methods.First, there is no theoretical guarantee that the iteration solution can be found; in some cases, the iteration methods do not converge.Second, the iterative search is usually time consuming for a required accuracy because the convergence speed varies case by case.Thus, iterative approaches are typically employed in high accuracy and nontime critical applications, usually offline simulation for a limited system size.Some simulation tools, such as SaberRD, Pspice, and PAN academic circuit simulator (described in [9]), belong to this category.
Noniterative methods usually sacrifice some accuracy to trade for simulation speed so that they can be utilized in time-critical applications, such as real-time simulation or to handle larger systems.There are numerous effective schemes to simplify nonlinearities.Many of them are designed based on the specific nonlinearity's characteristics, and therefore cannot be promoted as general methods.Among these approaches, two of the most commonly used are: current-source representation with time lag Δt and piecewise linear representations [3].The first approach is easy to understand and implement.At simulation time t, all variables are known at t − Δt (here a fixed time-step method is assumed); thus, they can be used to calculate the current flow of nonlinear element at time t − Δt and inject it as a current source at time t.Some very complicated or large-inertia nonlinear elements are implemented using this way, such as electric machines in PSCAD/EMTDC [10] and the work in [11].However, under this principle, any sudden change in voltage causes a current response only in the next time step.Thus, for the previous time step, the machine looks like an open circuit and spurious spikes may appear in the machine terminal voltage [12].In order to alleviate this phenomenon, some modifications have to be made.In [12], a resistance is placed in parallel with the machine to reduce the spurious spikes and an additional compensated current source is added to offset the current introduced by this resistance.The current source representation with time lag Δt runs fast in simulation and is accurate provided that the time step is small enough.This condition, however, is not difficult to meet as the digital processor performance has increased dramatically over the years.
The piecewise linear approach divides the nonlinear element characteristics into several working sections and models it as a linear component with different parameters to fit its characteristic in each section.Typical example of this method is the simple switch model in PSCAD/EMTDC.During ON state, it acts as a very small resistance while in OFF state a very large resistance [10].Other piecewise linear model applications can be found in the works presented in [13]- [18].This approach can achieve a reasonable accuracy while retaining an elegant mathematic form.However, the varying parameters will alter the system admittance matrix and each time it is changed, the triangularization or inversion process has to be reconducted, thus lowering the simulation speed.If there are lots of nonlinear elements in a system, such as the MMC, the execution rate can be extremely low; therefore, making this approach impractical.There are some efforts to make the switch model equal for both ON and OFF states [19]- [20] so as to make the admittance matrix constant.However, such methods either induce inaccurate power losses or require complicated revising calculations, and cannot be promoted to model other nonlinearities.Pejović and Maksimović [21] present a method to simulate the piecewise linear elements using constant admittance matrix and maintain speed advantage over general purpose simulation tools; however, the equivalent resistor's value and the computation of companion current source of piecewise linear elements are not addressed fully.
These two approaches have covered the most scope of noniterative nonlinear element modeling, yet they cannot replace one another.For components like electric machines, it is very difficult to find their equivalent linear circuits because their characteristics change continually with rotor position, angular velocity, magnetic flux saturation, etc.On the other hand, components like switches are not suitable for current source representations with time lag Δt because during ON state, they behave more like a voltage source (i.e., zero terminal voltage with almost arbitrary current).Due to the limitation of these two modeling methods, they have to coexist in one simulation; however, the different mathematic representations make the solution process complex and lower the execution efficiency.
Some other methods decouple the nonlinear components manually so that they can be dealt independently [22]- [25].These include the latency insertion method and transmission line modeling method.Such methods introduce artificial latency into a system (by inserting small inductive or capacitive elements) so as to permit a complete (approximate) decoupling of the system equations [22].The side effect is that the inserted inductive or capacitive elements may distort the original component's behavior when its time constant is very small.Therefore, sometimes it is very difficult to choose the appropriate latency or transmission line parameters.
In this paper, a generalized method for modeling nonlinear elements in power electronic circuit simulation is presented.This method can deal with any form of nonlinear characteristic using a uniform mathematic representation, thus making the admittance matrix constant.By properly selecting the equivalent resistor, the time-varying part of the matrix to solve nonlinear elements is restricted only on diagonals.Three methods to solve the diagonal time-varying matrix equations are presented correspondingly.These properties can significantly reduce the execution burden of the simulation program.In addition, the system is decoupled naturally and only when there are real inductances or capacitances; thus, high accuracy is retained.Besides, the method does not have instability risk because it is inherently stable.Offline and real-time simulation results of two typical power electronic circuits-MMC and neutral point clamped (NPC) topologies-are provided and compared with the existing simulation softwares both at system level and device level.This paper is organized as follows.Sections II and III introduce the uniform representation of nonlinear elements and the classification of nonlinear behaviors.Section IV elaborates the solution process of nonlinear elements' companion current source.Sections V and VI present the offline and real-time simulation validation results and Section VII concludes this paper.

II. UNIFORM REPRESENTATION OF NONLINEAR ELEMENTS
Most of the circuit simulation programs choose node voltages as the unknown variables and use nodal analysis to develop a set of linear equations by applying Kirchhoff's current law (KCL) at every circuit node.In nodal analysis, it is preferable to use the Norton equivalent circuit (a current source in parallel with a resistive branch) to represent the circuit element.Fig. 1 shows the Norton equivalent representation of lumped linear inductance and capacitance.
The equivalent resistor R eq is obtained using the trapezoidal rule of integration and it is a constant for a fixed time-step Δt.The companion current source I com is calculated from the past history (the previous time-step variables) [2].However, when this concept is applied to nonlinear elements, usually both R eq and I com are time varying; thus, making the analytical solution process complex and the computational complexity of solving a time-varying matrix equation is O(N 3 /3).Actually, the potential of I com has not been exploited fully yet.I com can be calculated not only from history variables but also from current time-step variables.Assume that there is a nonlinear element connecting an external network at node k and m, as shown in Fig. 2. No matter what kind of characteristics this nonlinear element possesses, ultimately there exists a solution for terminal voltage v km (t) and branch current i km (t) at simulation time t.From the viewpoint of the external network, as long as ( 1) is satisfied, it cannot distinguish whether the component between nodes k and m is a nonlinear element or a constant resistor R eq in parallel with a current source I com As can be seen from Fig. 2, the equivalent resistor for the nonlinear element is a constant, just like the linear element in a fixed time-step simulation.Thus, the system admittance matrix becomes constant since only the equivalent resistor will appear in the admittance matrix.As a result, the computational complexity for solving node voltages reduces to O(N 2 ).
The only problem of applying this representation for nonlinear elements is how to find this I com (t) that satisfies (1) since both v km (t) and i km (t) are unknown.Actually, (1) is difficult to solve because not only the relationship between v km (t) and i km (t) is nonlinear but also that the external network may contain multiple nonlinear components.The theoretic accurate solution cannot be found without using iterative methods.In order to save execution time, approximations have to be made like all other noniterative methods.

III. CLASSIFICATION OF NONLINEAR BEHAVIOR
Before introducing the appropriate approximations that should be made to match this uniform representation, the nonlinear behaviors are classified first.In the authors' opinion, nonlinear behavior can be roughly classified into two categories: current-source-type behavior and voltage-source-type behavior.Some nonlinear elements always behave as one type, but others may transfer types according to their working condition.The classification is done mainly by empirical conventions and can be programmed into the model as an attribute for reuse.Table I lists some conventions of nonlinearities.

A. Current-Source-Type Behavior
Typical current-source-type behaviors include the output characteristics of PV panel, an electric machine under load  mode, a semiconductor switch in turn-OFF transient and steady OFF state, etc.What they have in common is that they can be viewed as a current source in the circuit, and more importantly, their current can be predicted precisely based on the history (for example, the previous time step) information.Take the semiconductor switch as an example, the turn-OFF transient characteristics can be obtained experimentally and recorded to develop a turn-OFF analytical function, as shown in Fig. 3(a).Some of the characteristic parameters, such as turn-OFF delay time t d(off ) and fall time t f , can be extracted from the experimental data.Finally, a per-unit turn-OFF function can be obtained and used to generate the actual waveforms by scaling with the current amplitude in simulation.The key point of this method is to determine the current value at the next time step by a predefined trajectory and this trajectory comes from many experimental measurements to achieve a high precision.This approach has a very good curve-fitting performance and offers reasonable accuracy for current prediction in turn-OFF transient [11], [26].As for the steady OFF state, the switching device can be viewed as a current source whose current is zero.Other elements, such as PV panel and electric machine, do not have a predefined trajectory.Their current can be predicted simply because of system inertia.Provided that the time step is small enough, their current can be updated using previous time-step information and assumed to be consistent over one time step.

B. Voltage-Source-Type Behavior
Typical voltage-source-type behaviors include the output characteristics of most kinds of batteries, an electric machine under generator mode, semiconductor switches in turn-ON transient and steady ON state, etc.What they have in common is that they can be viewed as a voltage source in the circuit, and more importantly, their voltage can be predicted precisely based on the history information.For example, the voltage trajectory of a semiconductor switch in turn-ON transient can be tested and recorded to develop a per-unit function.When scaled with the voltage amplitude in simulation, it can predict the voltage value at the next time step.Other elements' voltages, such as batteries and electric machines, can be predicted because of their system inertia.
One may ask, since the voltage and current trajectories of semiconductor switches can be recorded simultaneously in experiments, why under turn-OFF transient should it be treated as a current source and under turn-ON transient as a voltage source?That is because under turn-OFF transient, the conducting current will decrease from some value to zero, which means that the initial and final values of current are known during this period.Then, the initial current value can be used as the scaling factor of the per-unit function.While in the same period, the voltage across the switch will increase from almost zero to some value that the program does not know.The voltage per-unit function is useless without an appropriate scaling factor.Actually, the voltage waveform of a semiconductor switch under turn-OFF transient is usually determined by its complementary device's voltage turn-ON trajectory.This can be illustrated in Fig. 4. If S 1 is under turn-OFF transient, then S 2 or its antiparallel diode will be under turn-ON transient.S 1 will act as a current source and S 2 a voltage source.The current of S 2 is determined by the difference of i L and i S1 while the voltage of S 1 is determined by the difference of v C and v S2 .

IV. SOLUTION PROCESS OF NONLINEAR ELEMENTS' COMPANION CURRENT SOURCES
Assume that all the elements in the system to be simulated are represented by their Norton representations, and the equivalent resistor is constant for linear and nonlinear components.Then, a set of node voltage equations can be formed by applying KCL as the matrix equality shown in the following equation: where G is a constant nodal conductance matrix, V is the node voltage vector, and I is the companion current source vector.Note that the members of I are not independent companion current sources but their linear combinations.For example, the equivalent circuit of the half-bridge topology in Fig. 4 is shown in Fig. 5 and the KCL equation at node 2 can be expressed as follows: The right-hand side of (3) can be further transformed as a constant vector multiplying the independent companion current source vector I shown in the following equation: Assume that there are N 1 nodes and N 2 independent companion current sources in the system, then the matrix equality in (2) can be recomposed as following: (5 where each row of T indicates the current source configuration at one node.T is also constant and its elements are either 0 or 1 or −1.Based on (5), the node voltage vector can be expressed as where Equation ( 6) reveals an important piece of information: given that all the components in system are represented by constant resistors in parallel with current sources, the node voltage can be expressed as linear combinations of these current sources, and more importantly, the coefficients of these current sources are constant.This is also true for any voltage difference between two nodes.For example, the voltage difference between nodes k and m at simulation time t can be expressed as Authorized licensed use limited to: UNIVERSITY OF ALBERTA.Downloaded on April 18,2022 at 18:55:05 UTC from IEEE Xplore.Restrictions apply.

R E A D O N L Y
Furthermore, the branch current between nodes k and m is given as where l is the index number of I com between nodes k and m in the system.As discussed in Section III, the nonlinear element will behave like a current or voltage source according to its nature or working condition and its current or voltage value at the next time step can be predicted with reasonable precision.Then, either i km (t) or v km (t) in ( 1) can be replaced with the predicted value.This is the approximation that is made to simplify the solution process of nonlinearities' I com (t).
Assume that there are N 3 nonlinear elements or pure voltage/current sources labeled as 1 to N 3 (here pure voltage/current sources and nonlinear elements are categorized into one group since they have no differences from the point view of the simulation program) and N 4 linear elements labeled as N 3 + 1 to N 2 in system (N 4 = N 2 − N 3 ).If the simulation program has finished the calculations at time t − Δt and is going forward to time t, the companion current source of N 4 linear elements can be computed first based on information at time t − Δt.The companion current source of the other N 3 elements should be solved based on (1).For each of these N 3 elements, if i km (t) is known, then replacing v km (t) with (7) to have R eq (9) which can be rewritten as If v km (t) is known, then replacing i km (t) with ( 8) results in (11) which can in turn be rewritten as In Section II, the equivalent resistor R eq of nonlinear element is only assumed to be constant but is not assigned any specific value.In this paper, all the R eq of nonlinearities are chosen to be 1 so as to make (10) and ( 12) to have similar form.As can be seen that if R eq =1, then apart from the additional I l (t), the left-hand side of ( 10) and ( 12) is exactly the same, and their right-hand sides are all known items.Therefore, N 3 linear equations are developed and can be written as Each row of M N 3 ×N 3 represents a current or voltage equation of one nonlinear element or pure voltage/current source.Coefficients of M N 3 ×N 3 can be computed by doing row subtraction of matrix A N 1 ×N 2 .The only thing that one has to be aware of is that an additional 1 has to be added to the diagonals of M N 3 ×N 3 if that row represents a current equation like (10).This means that the off-diagonal elements of matrix M N 3 ×N 3 are always constant, the only time-varying part is its diagonal part.
As mentioned in Section III, some nonlinear elements always behave like either current or voltage source and will not transfer types during simulation.These include PV panel, batteries, some electric machines, pure voltage/current sources, etc.Other nonlinear elements, such as semiconductor switches, will consistently transfer types between current and voltage sources.Therefore, M N 3 ×N 3 can be partitioned into blocks and ( 13) can be rewritten as where the upper equations represent the elements that will change types during simulation and the lower equations represent those that do not.This means M 12 , M 21 , and M 22 are constant, and only the diagonal of M 11 is time varying.Using lower part of ( 14), I 2 can be expressed as Replacing I 2 into the upper part of ( 14) using (15) yields the following system equations: Equation ( 16) is the core difficulty of solving the whole system states since this is the only process involving time-varying matrix.For the sake of convenience, in the remaining part of this paper, (16) will be denoted as where The following sections provide three feasible methods for solving (17).

A. Solution by Precomputed Inversion or Factorization
When the dimension of M is small, it is very convenient to compute all the possible cases of M and store its inversion or factorization in memory.The corresponding computational complexity is constant.It has to be pointed out that this method is not necessarily only restricted to systems with very few elements.Some topologies, such as MMC, are also suitable for this method.To demonstrate this, an important property is indicated and proved here-when a very small resistor lies in parallel or a very large resistor lies in series with the connecting route of two nonlinear elements, then the voltages and currents of these two nonlinear elements are almost decoupled.Since the equivalent resistor of an inductor (2L/Δt) is usually very large and the equivalent resistor of a capacitor (Δt/2C) is usually very small, this property is useful in application.
Assume that there is a very small resistor R C that lies in parallel with the route between two nonlinear elements, as shown in Fig. 6(a).The remaining part of the route is assembled in a two-port network and described by a Y matrix shown as follows: Then, the following KCL equations can be developed: 1 Solving (19), v 1 and v 2 can be expressed as where Δ is the determinant of the matrix in ( 19) Because R C is very small and Δ is a very large value.Then, the cross-coupling coefficients y 12 /Δ and y 21 /Δ in (20) are almost zero, which means I NL1 has negligible effect on v 2 and I NL2 has negligible effect on v 1 .Thus, v 1 and v 2 are almost decoupled.Similarly, when there is a very large resistor R L that lies in series with the route between two nonlinear elements, as shown in Fig. 6(b), following equations can be developed: The cross-coupling coefficient between v 1 and I NL2 is R L , and the cross-coupling coefficient between v 2 and I NL1 is where Δ is the determinant of the matrix in (22).Because R L is very large, Δ is a normal value while 1 R L is almost zero.Thus, v 1 and v 2 are almost decoupled.It can also be concluded that the larger the values of L and C and the smaller the time step, the higher the decoupling degree will be.This property does indeed coincide with the objective laws in a real system-large inductors and capacitors are usually used to add inertia into the system and smaller time steps will offer better precision.Actually, there are already some efforts in literature using the inductive and capacitive parts to decouple the circuits and they often explained it in physics.This section provides a way to explain in mathematics why inductive and capacitive parts can be used to decouple the circuits and why inductive element should lie in series and capacitive element should lie in parallel with the decoupled subcircuits.
The MMC topology has a very elegant structure, as shown in Fig. 7.A three-phase MMC has six arms, each arm has several submodules (SMs) and one arm inductor, and each SM consists of one capacitor and two switches in a half-bridge topology.The arm inductors are used to limit the circulating current within the converter.However, they also provide great facility in simulation.As can be seen in Fig. 7, the connecting route of any two SMs has two series-connected arm inductors, which means these SMs are almost decoupled from each other and can be dealt separately.Since all the SMs in the MMC share the same 2 × 2 matrix M, then it is very convenient to store its inversion or factorization in memory.This has been validated by the test case in the subsequent section.

B. Solution by Modified Gaussian Elimination
The Gaussian elimination is always a competitive method for solving linear equations.Equation ( 17) is no exception.However, considering that matrix M has constant off-diagonal elements, a modified Gaussian elimination method is provided that can reduce the computational complexity to a great extent.Assume that matrix M has an initial structure shown as follows: where v represents the variable, which has two possible values and c represents the constant.Unlike the traditional Gaussian elimination, the first step of the modified method produces zeros in the last column using row 1 elements.After Step 1, matrix M becomes ... v where m (1) i,j = m i,j − m i,N m 1,j /m 1,N , for 2 i N, 1 j N .The constants in the first column and the last row become variables because of Step 1 elimination while the other constants are still constants but with another value.
Similarly, Step 2 produces zeros in the second last column using row 2 elements and matrix M (1) becomes The constants in the second column and the second last row become variables while other constants remain constant.After (N + 1)/2 steps, the structure of matrix M will be like the one shown in Fig. 8, where P is a (N + 1)/2 × (N + 1)/2 variable matrix.Since the right-half block of M ( (N +1)/2 ) corresponding to P is all zeros, solving P using the traditional Gaussian elimination can obtain the first (N + 1)/2 unknowns of I.After this, substituting them into Q and R, the other unknowns of I are also obtained.Note that the first (N + 1)/2 steps of the modified Gaussian elimination is quite simple and can be conducted in advance.It can be proven that every element of matrix P has at most four possible values, whereas the element of matrix R has at most two possible values and matrix Q is constant.So it will not take much memory to store them.
Because the dimension of P is almost half as that of M and the Gaussian elimination is a O(N 3 /3) problem, if the number of nodes is approximately equal to the number of nonlinear elements, then the computation load corresponding to matrix P is approximately one-eighth of that of the traditional method, i.e, solving (2) using a time-varying G. Considering that the system may be partitioned into several subblocks due to the property discussed in Section IV-B, this number can be even smaller.

C. Solution by Updating Inverse Using the Sherman-Morrison Formula
The Sherman-Morrison formula is a useful tool to update the inverse of an original matrix after a rank-1 modification [27].Assume u is a column vector and v is a row vector, then the inverse of matrix M + uv can be expressed as where σ is a scalar and σ = 1/(1 + vM −1 u).
Considering that matrix M has variables only on diagonal, any two possible variations of M can be mutually transformed by a series of rank-1 modifications.The computational complexity of calculating the modified inverse M −1 , where m is the number of diagonals of M 1 that differ from those of M 0 .If there are enough base matrices that have been computed and prestored in memory so that m N/3 , then this method will have advantage over the traditional Gaussian elimination method.Obviously, more base matrices reduce the computational complexity but require more memory space.It has to be pointed out that the actual diagonal combinations of M is far less than 2 N , for example, Authorized licensed use limited to: UNIVERSITY OF ALBERTA.Downloaded on April 18,2022 at 18:55:05 UTC from IEEE Xplore.Restrictions apply.

R E A D O N L Y
M that consists of all voltage equations is most likely to be singular because all the power electronic switches cannot be switched ON simultaneously.This will significantly reduce the memory storage demand.
Note that formula (26) may cause numerical error when (1 + vM −1 u) is very small.This happens frequently because the diagonal elements of M −1 are often very close to ±1.To the best of author's knowledge, when choosing the base matrix M 0 , it is better to select certain rows and add an additional 1 to the diagonal elements of current equation and subtract an additional 1 from the diagonal elements of voltage equation and leave the additional modified diagonal elements to be the last steps of the Sherman-Morrison update process.This will avoid the numerical distortion in most cases.
The above-mentioned three methods provide three perspectives to solve diagonal time-varying linear equations.However, they each have their own advantages and eligible applications.When solving a new system, decoupling is always the first consideration.After the decoupling step, a proper method needs to be chosen.Generally speaking, precomputed inversion or factorization is suitable for small size M; the Sherman-Morrison method is suitable for medium size; and the Gaussian elimination is suitable for large size.In addition, the Gaussian elimination is more suitable for sequential programs (such as CPU-based program) because of low computational complexity, whereas the Sherman-Morrison method is more suitable for parallel programs (such as FPGA-based program) because the intermediate quantities do not depend on each other; thus in each process, the N 2 updating values can be calculated simultaneously, which makes the O(mN 2 ) problem O(m) with respect to computation time.
Another important feature that has to be pointed out is that this method is always numerically stable, no matter what the nonlinearities are and how they connect with each other.This is because the equivalent representation of all the nonlinear elements has an internal resistor.Thus, from the system's point of view, it looks like to deal with a pure resistor network, and the only time-varying part is the companion current source and the system itself does not have any instability risk.

V. OFFLINE SIMULATION VALIDATION
The MMC and NPC converters are two commonly used topologies in power electronic industry and will be used as two test cases to validate the proposed simulation method.The output waveforms and computation efficiency are compared with the existing system-level and device-level simulation tools-PSCAD/EMTDC and SaberRD.

A. Offline Simulation Results of MMC
The MMC topology is an effective configuration to achieve high voltage and power using moderate switching devices.It is widely used in the high voltage direct current (HVDC) transmission systems around the world.For a long time, simulation of MMC topology behavior was a tough task because it has too many semiconductor switches in one converter.But using the proposed method, this task becomes comparatively easy because all the SMs in MMC are decoupled from each other in nature.
A three-phase five-level MMC is constructed and served as the test case and its parameters are listed in Table V (see the appendix).The resulting waveforms of this five-level MMC from PSCAD/EMTDC and a C-program employing the proposed method are shown in Fig. 9.The companion current source of nonlinear elements are solved by precomputed inversion, as discussed in Section IV-A.The modulation scheme adopted here is the phase shift pulsewidth modulation (PWM) with a displacement angle of 45 • between upper and lower arms [28], [29].As for the capacitor voltage, a combination of averaging and individual proportional-integral (PI) controllers are designed to achieve voltage balancing (see reference [30]).The dc-link voltage is set to soft start from 0 to rated value in 50 ms and the simulation time step is 500 ns.Differences in these waveforms can barely be observed.Further calculations show that the total harmonic distortion (THD) of line-line voltage between A and B phases (v A−B ) is 9.82% (PSCAD/EMTDC results) and 9.81% (the proposed method results), respectively, and the THD of the load current of phase A (i A ) is 1.25% for both the results.Two capacitor voltages, namely the upper most and lower most SM of phase A leg (C A1 and C A8 ), are selected to be drawn in Fig. 9 The nonlinear element, i.e., the insulated-gate bipolar transistor (IGBT), is another major concern of the simulation accuracy.In PSCAD/EMTDC, an IGBT is modeled as a two-state resistance (R on and R off ) and the switching ON and OFF waveforms of an IGBT are shown in Fig. 10(a).As can be seen, in ON state, the voltage across the IGBT's collector and emitter (v CE ) is always approximately to zero, whereas in OFF state, the IGBT's collector current (i C ) is always approximately to zero and the switching transition between ON/OFF states is accomplished in one time step.As a comparison, the proposed method based program also applies a simple switch model to imitate the ideal switch, namely treat the switch as a voltage source whose value is zero in ON state and a current source whose value is zero in OFF state and no switching transient is assumed.The resulting waveforms are shown in Fig. 10(b).The nonlinear element behaviors from these two simulation tools are also very close and they both resemble the ideal switch.
The computational efficiency may be the most significant difference between these two tools.Execution time test is conducted between PSCAD/EMTDC and the C-program.Because comparison between a general purpose software and a specific C-program is not particularly fair, the following efforts have been made to make it fairer.
1) The compile process of the program is not included and only the execution time is accounted for.2) When running the PSCAD/EMTDC model, no graphic output is added so as to avoid extra time for drawing figures.
3) The execution time for the C-program is measured in debug mode rather than in release mode.To conduct a 100-ms simulation using time step of 500 ns on a computer with Intel i7-3770 processor, PSCAD/EMTDC spends around 34 s, whereas the proposed method based program only costs 3.361 s, which is approximately ten times faster.This demonstrates that the proposed method has great advantage over the traditional method in terms of computational efficiency while maintaining almost the same accuracy.As the system includes more nonlinear elements, this advantage will become more significant.

B. Offline Simulation Results of NPC Topology
The NPC topology is another commonly used multilevel converter that is often applied in medium power circumstances and the three-level NPC converter is the most mature one.In this test case, a three-level NPC converter fed permanent magnet synchronous machine (PMSM) drive system is constructed.The configuration of the test setup and its equivalent circuit as well as the control scheme are illustrated in Fig. 11.
There are 12 nodes, 3 linear elements (1 source and 2 capacitors), and 19 nonlinear elements (12 IGBTs, 6 diodes, and 1 machine) in this system.Because the machine is a threeterminal element, it is represented by three resistors and the corresponding companion current sources.However, the threephase currents of the machine are not independent, their sum is always zero, which means there are only two degrees of freedom, thus one companion current source (I MC ) can be set to zero.Then, matrix A is of 12 × 23 and M is of 20 × 20.Among the 20 nonlinear elements, the IGBTs and diodes will transfer types during simulation while the machine will always be treated as a current source.This means, as in ( 14), M can be partitioned into a 18 × 18 M 11 and a 2 × 2 M 22 .When applying matrix transformation, as shown in (16), one would find that the 18 × 18 diagonal time-varying matrix is transformed into three identical 6 × 6 decoupling matrices.Each matrix represents one phase leg of the three-level NPC converter.It makes sense because the three-phase legs connect in parallel with the capacitor C 1 and C 2 whose equivalent resistor is very small.As explained in Section IV, a small resistor will decouple the subsystems that connect in parallel with it.The decoupling nature of the nonlinear elements significantly reduces the computational burden.In this test case, the time-varying 6 × 6 matrix equations is solved by the modified Gaussian elimination method discussed in Section IV-B.Note that node 0 in Fig. 11 is a virtual node that has no physical meaning.It only serves as a reference point and even if the stator windings of PMSM connect in wye type, the potential of node 0 is not equal to that of the neutral point of the wye configuration.
The parameters of the system are listed in Table VI (see the appendix).The PMSM model applied here is a fifth-order model that has three stator windings and two short-circuited damping windings.
The output simulation waveforms of PSCAD/EMTDC and the proposed method based C-program at the time step of 200 ns are shown in Fig. 12.The machine is set to start up at half load and the closed-loop space vector pulsewidth modulation (SVPWM) control strategy is employed where the d-axis current is tuned to zero and the q-axis current is tuned to 1 kA.The differences of waveforms from these two simulation tools can hardly be distinguished.Only in very small time-scale scope, one can find the slight discrepancies, as denoted in Fig. 12(d) and (h).This is because in PSCAD/EMTDC, the switch devices are simple R on /R off model, whereas in the new method based program, the realistic IGBT and diode switching characteristics are considered.Actually, the waveforms in Fig. 12(h) are more close to experimental results (see [31,Fig. 8] and [32, Fig. 9]).There are always voltage spikes during the voltage-level transition and they are indeed caused by the switching transients of semiconductor devices.
To further validate the accuracy of the proposed method based program, the IGBT and diode switching transient waveforms in NPC topology are also recorded and compared with SaberRD that has very sophisticated semiconductor models.The target IGBT and diode part number employed in simulation are FZ1500R33HL3 from Infineon and VS-SD1053CS20L from Vishay.The switching parameters of these two devices from datasheet are listed in Table II.
The outcome switching transients from these two simulation tools are illustrated in Fig. 13 and the corresponding switching parameters are also shown in Table II.As can be seen, the switching transient processes are highly nonlinear and can take several microseconds.Both these two simulation tools reproduced the switching transients properly.The key parameters (t d(on) , t r , t d(off ) , t f , t rr ) are very close to that given in datasheet.In this sense, the proposed method based  Again, the computational efficiency performance of these simulation tools differ greatly.To conduct a 1-s simulation of this NPC-fed PMSM drive system at the time step of 200 ns on an Intel i7-3770 processor computer platform, PSCAD/EMTDC spends around 143 s, whereas the proposed method based program only consumes 13.288 s-more than ten times faster.As for the SaberRD, it takes several hours to finish the same test setup due to the complexity of its iterative algorithms and it utilizes the variable time step [33], thus its computational efficiency is not compared here.
The above-mentioned two test cases demonstrate that the method proposed in this paper is feasible in offline simulation.It can give almost the same accuracy results of mature simulation software while spending far less execution time.In addition, this method has high generality.It can be applied to deal with most of (if not all)the nonlinearities in the major power electronic circuit topologies while retaining its high computational efficiency.

VI. REAL-TIME SIMULATION VALIDATION
Real-time simulation is widely accepted by industrial engineers and designers as it has the ability to offer hardware-in-theloop tests that are more close to real applications compared with offline simulation.In real-time simulation, the demand for computational efficiency is extremely high because it has to update the states of emulated system within every time step.In most cases, real-time simulation is very hard to achieve and simplifications have to made to trade accuracy for speed.However, the method proposed in this paper lowers the computational burden while still retaining almost the same accuracy with the traditional method.This means it is very suitable for real-time simulation.In addition, the constant admittance matrix exploits the advantage of parallel computing to its full strength as every node voltage can be calculated independently and simultaneously.
In this section, the above-mentioned two test cases are implemented in real time on the Xilinx Virtex UltraScale+VCU118-ES1 FPGA evaluation platform.The FPGA-based real-time simulator has merits such as low cost, low hardware utilization, easy maintenance, easy reprogramming, and high reliability.Because of the good generality of the proposed method, there is no special treatment of the circuit model itself like many other real-time simulation programs.Thus, the signal flowchart for these two test models is common and is given in Fig. 14.The whole project is divided into two parallel working modules: the control and the model module.The control module is designed to generate driving signals of semiconductor devices.Some of the common algorithms, such as coordinates transformation, PI controller, SVPWM, and PWM generator, are implemented here.The model module is responsible for updating the states of the simulated system and is partitioned into two sequential tasks: calculating the node voltages and calculating the companion current sources.The node voltages can be obtained by direct calculation using (6) or through lower-upper (LU) factorization of (2).In the FPGA-based program, the former is preferred although solving (2) has higher precision because ( 6) is more suitable for parallel programming and thus saves execution time; however, it is not always the case as it depends on the programmer and system sizing.
Calculation of the companion current sources is the key feature of this new method.It is partitioned into two successive processes-updating the states of linear and nonlinear elements.The linear elements will be dealt with first either by the trapezoidal rule or the backward Euler rule or any other integration method.Trapezoidal rule is always stable and has an error term of O((Δt) 3 ), whereas backward Euler rule has simpler form but lower precision.The updating of nonlinear elements starts with judgment of them to be treated as current or voltage sources.The judgment condition can be obtained from control signals or system states.After this step, the exact value of the equivalent current or voltage source is calculated in accordance with the nonlinear functions and a new matrix M is developed accordingly.Finally, the diagonal time-varying matrix equations is solved by either precomputed inversion or factorization or modified Gaussian elimination or updating inverse using the Sherman-Morrison formula.The selection of algorithm depends on many factors, especially the system sizing as discussed in Section IV.Considering that the program is implemented on the FPGA board, the first and third methods will have higher priority.In the two test cases that are considered here, the MMC model will use precomputed inversion and the NPC-fed PMSM drive model will employ the updating inverse by the Sherman-Morrison formula.

A. Real-Time Simulation Results of MMC
As discussed in Section IV, the matrix M in the MMC model is of 2 × 2 and has only two possible variations corresponding to upper switch ON lower switch OFF mode and upper switch OFF lower switch ON mode, respectively.Therefore, the process of updating the states of nonlinear elements is easy as these two possible M matrices are precomputed and stored in memory.The only challenge comes from calculating the node voltages.Because the MMC topology has many switches, the matrix A in (6) has a size of 53 × 82 for a five-level topology.If they are implemented fully parallel in FPGA, more than 4000 multipliers are needed and each multiplier will consume several digital signal processor (DSP) blocks that is unbearable for the existing FPGA board.To reduce the hardware consumption, several techniques are adopted here.
1) The data format employed in this test case is a 59-b fixed point number to guarantee lower DSP blocks consumption and efficient precision as well.Although the 59 × 59 fixed point multiplication has to use ten DSP blocks that is higher than double-precision floating point format, the addition of fixed point numbers is much simpler than floating point numbers.Thus, the overall hardware consumption is reduced.
2) The calculation process of each node voltage that involves 82 multiplications and 81 additions is partitioned into 4 groups to deal with 19, 20, 21, and 22 multiplications, respectively.Each group is instantiated by a multiply-accumulate module.This means that each multiplier is reused about 20 times, and more importantly, the pipeline technique is included.
With the help of these techniques, the 500 ns time-step realtime simulation of the five-level MMC test case discussed in Section V-A is realized.The system clock implemented is 100 MHz, which means that in every time step, the state updating is accomplished within 50 system clocks.The hardware utilization and real-time implementation waveforms are illustrated in Table III and Fig. 15, respectively.The results are almost the same with offline waveforms except they are real time.

B. Real-Time Simulation Results of NPC Topology
The NPC-fed PMSM drive system experienced a much simpler node voltage calculation process as its system size is much smaller.What should be paid attention to is solving the diagonal time-varying matrix equations.As analyzed in

TABLE IV HARDWARE UTILIZATION OF THE THREE-LEVEL NPC MODEL
LUT-look-up table; FF-flip-flops; BRAM-block random access memory Section V-B, M is a 6 × 6 matrix, but the possible variations of M are much smaller than 2 6 ; actually there are only four variations representing four possible switching combinations of one phase leg, as indicated in Fig. 16.Note that there is no need to worry about the switching pair deadtime as the deadtime modes are already included in the above-mentioned four possible modes.Observing the four possible switching combinations, modes (a) and (c) have four same switches (S 2 , S 4 , D 1 , and D 2 ) and two different switches (S 1 and S 3 ), and modes (b) and (d) have four same switches (S 1 , S 3 , D 1 , and D 2 ) and two different switches (S 2 and S 4 ).Thus modes (b) and (c) can be served as the base matrices in the Sherman-Morrison formula updating process.From mode (c) to mode (a) and from mode (b) to mode (d) only takes two updating processes.Furthermore, in order to accelerate the renewal of system states in every time step, the Sherman-Morrison updating process is not included in the main sequential computation loop of system states in the model module, but rather works in parallel with the main loop just like the control module.This means, whenever there is a new mode coming, the control module will be responsible for updating the new inverse of matrix M. It is when the new inverse of matrix M is ready that the new mode will take effect.This kind of operation will introduce some delay in the control loop, however, considering that there is always real delay (approximately half the carrier period of switching devices) in power electronic converter control and the introduced Sherman-Morrison updating process delay is less than 1 μs, which is much smaller than half the carrier period, the resulting effect is negligible.In this arrangement, the NPC-fed PMSM drive system achieved 200 ns time step on the 100-MHz system clock FPGA board.The resulting hardware utilization is summarized in Table IV and the real-time implementation waveforms are shown in Fig. 17.

VII. CONCLUSION
This paper provided a new perspective to deal with nonlinearities in a fast and stable manner in power electronic circuit simulation.The proposed method utilizes a uniform representation to substitute the nonlinearities in order to keep the system admittance matrix constant.The nonlinear elements then can be further viewed as current or voltage sources according to their nature or system states and the corresponding current or voltage equations are developed accordingly.Therefore, the computational difficulty is transformed to solve a diagonal time-varying matrix equation.Three methods are provided to solve it, which can cover most circumstances that are encountered in practical applications.The offline and real-time test cases validated the feasibility of this new method and also demonstrated its fast simulation speed.In addition, the proposed method has very good generality: it can be used to deal with most nonlinearities in major power electronic topologies while achieving comparable accuracy as with the mature simulation software at both the system level and device level.This method can be a good alternative algorithm to accelerate the speed in power electronic circuit simulation and it is also suitable to be made into a drag and drop tool incorporated with existing commercial softwares.

APPENDIX
See Tables V and VI.
Authorized licensed use limited to: UNIVERSITY OF ALBERTA.Downloaded on April 18,2022 at 18:55:05 UTC from IEEE Xplore.Restrictions apply.

Fig. 9 .
Fig. 9. Comparison of five-level MMC simulation waveforms between PSACD/EMTDC and the proposed method based program.(a) Line-line voltage (v A −B ) and load current (i A ) waveforms from PSCAD/EMTDC.(b) Counterparts of (a) from the proposed method based program.(c) SM capacitor voltage (V C A 1 and V C A 8 ) waveforms from PSCAD/EMTDC.(d) Counterparts of (c) from the proposed method based program.

Fig. 10 .
Fig. 10.Comparison of switching ON and OFF waveforms between PSCAD/EMTDC and the proposed method based program.(a) PSCAD/EMTDC results.(b) Proposed method based program results.

Fig. 12 .
Fig. 12.Comparison of three-level NPC-fed PMSM drive system start-up waveforms between PSCAD and the proposed method based program.(a) PMSM three-phase stator current from PSCAD/EMTDC.(b) NPC line-line voltage between phases A and B from PSCAD/EMTDC.(c) Magnified waveforms of (a) and (b).(d) Magnified waveforms of (c).(e)-(h) Counterparts of (a)-(d) from the proposed method based program.
Authorized licensed use limited to: UNIVERSITY OF ALBERTA.Downloaded on April 18,2022 at 18:55:05 UTC from IEEE Xplore.Restrictions apply.

Fig. 14 .
Fig. 14.Signal flowchart of the proposed method implemented on FPGA.

Fig. 16 .
Fig. 16.Four possible switching combinations of three-level NPC phase leg.Dark color represents switch-ON, whereas gray color represents switch-OFF.(a) Output positive voltage level.(b) Output neutral voltage level and load current are positive.(c) Output neutral voltage level and load current are negative.(d) Output negative voltage level.

TABLE I SOME
CONVENTIONS OF NONLINEARITIES

TABLE II SWITCHING
PARAMETERS OF IGBT AND DIODE program achieved comparable switching transients accuracy as with SaberRD.

TABLE III HARDWARE
UTILIZATION OF THE FIVE-LEVEL MMC MODELLUT-look-up table; FF-flip-flops; BRAM-block random access memory

TABLE V PARAMETERS
OF THE FIVE-LEVEL MMC TABLE VI PARAMETERS OF THE THREE-LEVEL NPC AND PMSM