# **Programmable multifunctional nanophotonic ICs: architectures, performance and challenges**

Daniel Pérez López<sup>\*a</sup>, José Capmany<sup>a</sup>

<sup>a</sup> Photonic Research Labs, iTEAM Research Institute, Universitat Politècnica de València, C. de Vera SN, Valencia, Spain.

# **ABSTRACT**

Programmable multifunctional integrated nanophotonics (PMIN) is a new paradigm that aims at designing common integrated optical hardware configurations, which by suitable programming can implement a variety of functionalities that can be elaborated for basic or more complex operations in many application fields. The strength of PMIN relies on the suitable interconnection of field-programmable waveguide arrays. Here, we review the recent advances reported in the field of PMIN, paying special attention to outlining the design principles, material platforms, synthesis algorithms and practical constraints of these structures. Finally, we discuss their applicability to different fields.

**Keywords:** programmable photonics, multifunctional photonics, software-defined, signal processing, microwave photonics.

# **1. INTRODUCTION**

Microelectronics has become one of the pillars of the digital economy in the early XXI century. Current and emerging applications demand information processing at ever higher speed and bandwidth, approaching a potential physical limit of electronic systems. The cooperative use of electronics and photonics is being studied and applied as an appealing way to overcome future performance limits, leveraging the best of the two complementary worlds for both digital and analog processing.

In integrated photonics, both the research and industrial communities have focused almost exclusively on the design and optimization of Application Specific Photonic Integrated Circuits (ASPICs) in recent years. Here, all the stages involved in the development of a PIC are tailored to optimize the chip performance, power budget, consumption, and footprint<sup>1</sup>. This strategy involves the optimization of photonic-based systems through multiple time-consuming cycles of custom design, fabrication, packaging and testing, leading to solutions that are far from being cost-effective for low and moderate volumes. Only very large volumes benefit from economies of scale, but such applications are not there yet, beyond datacenter interconnects and transceivers<sup>2</sup>.

A paradigm shift in PIC design explores the development of programmable circuits. This aims at designing common integrated optical hardware configurations, which by suitable programming can implement a variety of functionalities that can be elaborated for basic or more complex operations in many application fields<sup>3-6</sup>. In this respect, these systems may trade some overall power consumption, power budget and footprint for an unprecedented degree of flexibility and versatility, which is inherited by the programmed systems. This approach enables a new generation of field-programmable PICs that will potentially offer cost-effective and ready-to-use solutions and allow upgradable photonic-based systems<sup>6</sup>. In this paper, we review the proposed architectures and their performance, with a special focus on the scalability challenges and the analysis of the main limitations.

# **2. SYSTEM ARCHITECTURE**

The new generation of programmable PICs calls for an architecture redefinition that incorporates the subsystems, and their main tasks, required to become multifunctional and powerful. The system can be divided into three tiers: photonic, electronic and software. These subsystems can be integrated monolithically, at the card/PCB level, or following a disaggregated approach. Figure 1 (a-c) illustrates the basic system architecture.

\*dperez@iteam.upv.es; http://www.prl.upv.es/



Figure 1. (a) Top-level description of the basic processing units that build up the Field-Programmable Photonic Array (FPPA) core and variables employed for the statistical modelling. (b) Implementation and hardware embodiment of the photonic design. (c) Tiers definition and card-integration example of an FPPA. HPB: High-Performance Block, I/O: Input/Outputs.

### **2.1 Photonic tier**

The main task of the photonic tier is to perform the optical signal processing in an environment providing flexibility in both circuit topology and design parameters. Several architectures have been proposed and demonstrated to provide multiple-input/multiple-output linear transformations<sup>7-10</sup>. These rely on a waveguide mesh arrangement that coherently combines the light travelling in a feedforward direction. The right combination of the splitting and phase settings of each 2x2 beamsplitter with phase-shift capabilities leads the system to behave as any targeted linear transformation.

This 2x2 Tunable Basic Unit (TBU) has mostly been demonstrated using the well-known Mach-Zehnder architecture. Two independent phase shifters, one on each arm, provide the desired phase and splitting functionality. Please refer to Section 4 for an extended discussion. By replicating the same building block, as depicted in Fig. 1 (a-b), a more versatile architecture is obtained that extends the degrees of freedom, allowing the implementation of optical feedback loops and cavities<sup>3</sup>. As stated in the next section, the state of the art includes a design of 30 TBUs<sup>5</sup>.

Still in the photonic tier, the combination of a versatile waveguide mesh arrangement with programmable High-Performance Blocks (HPBs) has been proposed, giving birth to the Field-Programmable Photonic Array (FPPA)<sup>6</sup>. HPBs are signal processing blocks that are too complex to be constructed efficiently from TBU primitives. Examples include optical modulation, photodetection, amplification, sources, delay lines, high-performance filtering schemes, as well as (de)multiplexers. In combination with HPBs, the purpose of the waveguide mesh arrangement is two-fold: first, it enables the synthesis of the main optical signal processing architectures and, second, it provides a flexible interconnection scheme between the reconfigurable optical core, the HPBs and the input/output ports. This approach further increases the versatility of the device, allowing the designer to decide the order of the components and the circuit topology after fabrication. An example of the photonic tier is illustrated in Figure 1 (b).

Design principles for waveguide mesh arrangements have a direct impact on the targeted frequency and power penalty performance. Since several trade-offs appear, we will address them together with the main challenges and limitations in Section 4.

### **2.2 Electrical tier**

Versatile waveguide mesh designs involve allocating and independently driving more than 60 phase actuators. Although it is early for predictions, this number of low-speed DC interconnections is expected to grow at an initial rate of at least 30 phase shifters per year. As the number of phase actuators increases, the electrical wiring between the PCB carrier and the PIC requires a flip-chip approach, as opposed to a wire-bonding scheme, due to footprint constraints. Apart from the phase shifters, additional actuators and monitors will be required to ensure the correct behavior and configuration of the PIC.

Outside the PIC, the electrical subsystem incorporates a large number of programmable current sources and low-speed monitors. By reducing the amount of electrical power required in the actuators, it is possible to employ electrical ICs and simultaneously reduce the overall power consumption and the footprint of the system. The required number of electrical sources is in principle related to the number of actuators, although several alternatives based on dynamic current-divider interconnections can be employed.

Finally, a Control Unit is required to feed the control signals to the electrical sources and to read the DC monitors outputs. This unit allocates and runs the software tier described in the next subsection.

### **2.3 Software tier**

The strength of any waveguide-mesh-based architecture relies on the flexible interconnection scheme of heavily coupled basic units. As with electronic solid-state transistors, the interconnection between simple units makes it possible to accomplish complex processing tasks. However, the unprecedented number of units that must work together in a cooperative way calls for a powerful software toolbox to drive and properly configure each subsystem present in an FPPA. These algorithms range from the most basic operations, involving self-characterization, HPB driving and configuration, and basic switching between components, to a new class of algorithms that try to make the configuration operation transparent to the user.

This tier reduces the development time of PICs, since it optimizes a given hardware without necessarily requiring additional fabrication runs.

### **2.4 Workflow**

The workflow of an FPPA, once designed, is detailed in a very recent article<sup>6</sup>. It starts with the initial application entry or circuit configuration to be implemented, together with the main targeted specifications. These are then processed to optimize the area and performance of the final circuit. Then, the specifications are transformed into a compatible circuit of FPPA processing blocks (technology mapping), optimizing attributes such as delay, performance or number of blocks.

The technology mapping phase transforms the optimized network into a circuit that consists of a restricted set of circuit elements (FPPA processing blocks). This is done by selecting a set from the available hardware resources and specifying how these will be interconnected, which determines the total number of resources to be activated by programming. In a second stage, the processing block configurations are chosen, and performance calculation and design verification are carried out. The next step assigns each processing block to a specific location in the FPPA core. It also chooses the processing units that route the input signals of the core to the input/s of the programmed circuit, and the output/s of the programmed circuit to the core outputs.

The steps contained in the generic design flow can be carried out automatically by the software layer, manually by the user, or by a mixture of both, depending on the autonomy and the capabilities of the FPPA software tier. In addition, a failure in any of the steps will require an iterative process until the specifications are met. Additional parallel optimization processes (mainly self-winding) provide robust operation, self-healing attributes and additional processing power to the physical device.

# **3. PERFORMANCE AND EXPERIMENTAL DEMONSTRATIONS**

To date, the experimental demonstrations of multipurpose programmable PICs have been restricted to the reconfigurable optical core. They employ bulky discrete current sources and discrete non-integrated lasers, modulators and photodetectors. The software tier has been implemented to show the most basic operators and for analysis purposes, relying on manual operation for the most complex demonstrations. Figure 2 illustrates, in chronological order, the first three experimental examples in the field, showing their labelled layout details and chip photographs. Fig. 2(a) shows the programmable optical chip architecture reported by Zhuang and co-workers in silicon nitride<sup>4</sup>. Next, Fig. 2(b) shows the waveguide mesh we reported, composed of 7 hexagonal cells (30 thermally tuned TBUs) and fabricated in silicon on insulator (for more details on fabrication and testing see<sup>5</sup>). Finally, we recently designed a mesh based on 40 thermally tuned TBUs, which is shown in Fig. 2(c). In this case, we re-designed the shape of the TBU to achieve a more compact layout and increase the component integration density. This chip has been fabricated in a Multi-Project Wafer run at the fabrication platform of the Centro Nacional de Microelectrónica (CNM platform).

Despite the simplicity of the layouts depicted in Fig. 2, even the 7-cell structure is capable of implementing over 100 different circuits for optical filtering applications (basic MZI, FIR transversal filters, basic tunable ring cavities and IIR filters, as well as compound structures such as CROWs and SCISSORs), true time delay lines and optical coherent interferometry<sup>5</sup>.



Figure 2. Chip picture and fabricated layout for different waveguide meshes: (a) Square topology in SiN<sup>4</sup>, (b) Hexagonal topology in Silicon<sup>5</sup>, (c) Hexagonal topology in SiN with modified TBU scheme. BUL: Basic Unit Length of the Tunable Basic Unit (defined in<sup>3</sup>).

# **4. MAIN CHALLENGES AND LIMITATIONS**

The versatility of multipurpose programmable PICs is directly proportional to the number of resources and primitives contained in the chip. However, the scalability of these systems is limited by different factors: TBU insertion loss, power consumption, optical crosstalk/signal leakage, footprint and the complexity of the control electronics. To fully understand these limits, we first analyze the TBU as a standalone component and then the implications of its non-ideal effects at the circuit level. For the remainder of this section we consider an MZI-based TBU architecture.

### **4.1 TBU level**

The ideal TBU is composed of two 3-dB couplers interconnecting two arms, each loaded with one phase shifter, providing phase shifts $\phi_1$ and $\phi_2$, respectively (see Fig. 1 (a)). Ideally, by suitably programming each phase actuator, it is possible to obtain the desired optical power splitting ratio *K* and impose an additional common phase at the output. However, setting a desired TBU state suffers from several issues that introduce a drift from the targeted working point, both in amplitude and phase. These mainly arise from the imperfect design or fabrication of the waveguides and the 3-dB couplers, and from the presence of the phase shifters. In addition, the tuning-mechanism crosstalk from neighboring TBUs and the drift in the phase-shifter configuration resulting from the resolution or stability of the electrical drivers introduce a dynamic source of configuration errors.

To evaluate the performance of the TBU, we simulate the submatrix corresponding to the input/output computational modes given by (1).

$$
\begin{aligned}
H_{TBU} &= H_{dc\_Coupler,b}\, H_{2WG}\, H_{dc\_Coupler,a} \\
&= \sqrt{1 - \gamma_a} \sqrt{1 - \gamma_b}\, e^{-j\omega \tau}
\begin{bmatrix} c_b & -js_b \\ -js_b & c_b \end{bmatrix}
\begin{bmatrix} \sqrt{1 - \gamma_c}\, e^{-j\phi_1} & 0 \\ 0 & \sqrt{1 - \gamma_d}\, e^{-j\phi_2} \end{bmatrix}
\begin{bmatrix} c_a & -js_a \\ -js_a & c_a \end{bmatrix} \\
&= \sqrt{1 - \gamma_a} \sqrt{1 - \gamma_b}\, e^{-j\omega \tau}
\begin{bmatrix}
c_a c_b \sqrt{1 - \gamma_c}\, e^{-j\phi_1} - s_a s_b \sqrt{1 - \gamma_d}\, e^{-j\phi_2} & -js_a c_b \sqrt{1 - \gamma_c}\, e^{-j\phi_1} - js_b c_a \sqrt{1 - \gamma_d}\, e^{-j\phi_2} \\
-js_b c_a \sqrt{1 - \gamma_c}\, e^{-j\phi_1} - js_a c_b \sqrt{1 - \gamma_d}\, e^{-j\phi_2} & c_a c_b \sqrt{1 - \gamma_d}\, e^{-j\phi_2} - s_a s_b \sqrt{1 - \gamma_c}\, e^{-j\phi_1}
\end{bmatrix},
\end{aligned}
\tag{1}
$$

where $\gamma_a$, $\gamma_b$ are the loss coefficients of the 3-dB couplers and $\gamma_c$, $\gamma_d$ those of the interferometric arms, $c_{a,b} = (1 - K)^{1/2}$ and $s_{a,b} = K^{1/2}$ are the coupler coefficients, and *K* is the optical power splitting coefficient of the corresponding 3-dB coupler. We define the input/output splitting ratios *K* by a Gaussian distribution with mean 50% and a parametrized standard deviation. The excess loss of the couplers is modelled by a Gaussian distribution with a mean corresponding to 0.05 dB/coupler and a standard deviation of 1%. The loss in the arms comes from the propagation losses in the materials and the presence of the phase shifter. Here, we assume values defined by a non-negative Gaussian distribution with mean 5.16% and standard deviation of 2.84%<sup>10</sup>.
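As an illustration, Eq. (1) can be evaluated numerically. The following is a minimal sketch in Python with NumPy; the function names `coupler` and `tbu_matrix` and their argument names are our own choices for this example, not part of any published toolbox, and the coupler cross term is taken as $-j\sqrt{K}$ as written in the matrices of Eq. (1).

```python
import numpy as np

def coupler(K):
    """Lossless 2x2 coupler matrix for a power cross-coupling coefficient K."""
    c, s = np.sqrt(1.0 - K), np.sqrt(K)
    return np.array([[c, -1j * s], [-1j * s, c]])

def tbu_matrix(phi1, phi2, Ka=0.5, Kb=0.5,
               gamma_a=0.0, gamma_b=0.0, gamma_c=0.0, gamma_d=0.0,
               omega_tau=0.0):
    """Transfer matrix of an MZI-based TBU following the structure of Eq. (1)."""
    # Two arms, each with its own loss and phase shift.
    arms = np.diag([np.sqrt(1.0 - gamma_c) * np.exp(-1j * phi1),
                    np.sqrt(1.0 - gamma_d) * np.exp(-1j * phi2)])
    # Coupler excess losses and the common delay term.
    prefactor = (np.sqrt(1.0 - gamma_a) * np.sqrt(1.0 - gamma_b)
                 * np.exp(-1j * omega_tau))
    return prefactor * (coupler(Kb) @ arms @ coupler(Ka))

# Ideal (lossless, exact 3-dB couplers) working points.
H_bar = tbu_matrix(0.0, np.pi)        # no power at the cross port
H_cross = tbu_matrix(0.0, 0.0)        # all power at the cross port
H_half = tbu_matrix(0.0, np.pi / 2)   # 50:50 splitting

for name, H in (("bar", H_bar), ("cross", H_cross), ("50:50", H_half)):
    print(name, "cross-port power:", round(abs(H[1, 0]) ** 2, 6))
```

With ideal components, phase settings of 0 and $\pi$ give the bar state used in the statistical analysis, while equal phases give the cross state.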

To evaluate the static sources of error (Case A), we perform a Monte-Carlo simulation with $10^6$ samples. Each sample selects the loss and coupling coefficients from the aforementioned distributions and evaluates the TBU model of Eq. (1). The standard deviation of the couplers is varied from 0 to 5%. The phase-shifter values are kept constant at 0 and $\pi$, respectively, to set a theoretical bar state. As a result, we obtain the coupling ratio (*K*) of the TBU, the output phase difference, the cross-port ratio, the through-port ratio and the total loss. From this test, we can draw several conclusions. First, the coupling ratio is shifted from the ideal zero-coupling state up to a mean of 0.51% when the std. deviation of the couplers is set to 5%. The output phase difference remains equally concentrated at $-\pi/2$ and $\pi/2$. The most important issue is related to the optical crosstalk, defined by the cross-port ratio. This figure is represented by the blue solid line in Fig. 3. We can see that, as we increase the std. deviation of the couplers, the optical crosstalk worsens from -45 dB to -28.41 dB.
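The Case A procedure can be sketched as follows. This is a hedged illustration rather than the simulation code behind Fig. 3, and it rests on several assumptions of ours: percentages are read as absolute fractions (5% means 0.05), the non-negative Gaussian is obtained by redrawing negative samples, the cross-port ratio is the cross-port power relative to the input power, its mean is taken over the dB values, and fewer samples than $10^6$ are used by default to keep the run short. The names `nonneg_normal` and `simulate_case_a` are hypothetical.

```python
import numpy as np

def nonneg_normal(rng, mean, std, size):
    """Gaussian samples in which negative draws are redrawn (assumed reading)."""
    x = rng.normal(mean, std, size)
    neg = x < 0
    while neg.any():
        x[neg] = rng.normal(mean, std, int(neg.sum()))
        neg = x < 0
    return x

def simulate_case_a(sigma_k, n_samples=100_000, seed=0):
    """Monte-Carlo evaluation of Eq. (1) in the bar state (phases 0 and pi)."""
    rng = np.random.default_rng(seed)
    # Coupler splitting ratios: mean 50%, parametrized std. deviation.
    Ka = np.clip(rng.normal(0.5, sigma_k, n_samples), 0.0, 1.0)
    Kb = np.clip(rng.normal(0.5, sigma_k, n_samples), 0.0, 1.0)
    # Coupler excess loss: mean of 0.05 dB expressed as a power fraction.
    coupler_loss = 1.0 - 10.0 ** (-0.05 / 10.0)
    ga = nonneg_normal(rng, coupler_loss, 0.01, n_samples)
    gb = nonneg_normal(rng, coupler_loss, 0.01, n_samples)
    # Arm loss: mean 5.16%, std. deviation 2.84%.
    gc = nonneg_normal(rng, 0.0516, 0.0284, n_samples)
    gd = nonneg_normal(rng, 0.0516, 0.0284, n_samples)

    phi1, phi2 = 0.0, np.pi
    ca, sa = np.sqrt(1.0 - Ka), np.sqrt(Ka)
    cb, sb = np.sqrt(1.0 - Kb), np.sqrt(Kb)
    d1 = np.sqrt(1.0 - gc) * np.exp(-1j * phi1)
    d2 = np.sqrt(1.0 - gd) * np.exp(-1j * phi2)
    pre = np.sqrt((1.0 - ga) * (1.0 - gb))
    # First-column elements of Eq. (1); the common delay phase is omitted
    # because only powers are evaluated here.
    h11 = pre * (ca * cb * d1 - sa * sb * d2)
    h21 = -1j * pre * (sb * ca * d1 + sa * cb * d2)

    p_through = np.abs(h11) ** 2
    p_cross = np.maximum(np.abs(h21) ** 2, 1e-30)  # floor avoids log10(0)
    return {
        "mean_coupling": float(np.mean(p_cross / (p_cross + p_through))),
        "mean_cross_db": float(np.mean(10.0 * np.log10(p_cross))),
        "mean_loss_db": float(np.mean(-10.0 * np.log10(p_cross + p_through))),
    }

for sigma in (0.0, 0.025, 0.05):
    print(sigma, simulate_case_a(sigma))
```

The dynamic errors of Case B could be explored in the same way by replacing the fixed phases with Gaussian-distributed arrays, once a convention for the phase percentage is chosen.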



Figure 3. Statistical modelling results for Cross-port Ratio<sup>3</sup>. Case A: No phase errors, couplers mean K = 0.5, $\sigma_K$ ranging from 0 to 5%. Case B: Fixed $\sigma_K$ = 4.33%, $\sigma_{\phi}$ ranging from 0 to 5%.



Figure 4. Statistical modelling Case B. Example of input distributions for the TBU variables of Eq. (1) and resulting distributions for the overall coupling ratio, phase differences, cross-port ratio, through-port ratio and total loss of a TBU.

For the evaluation of dynamic sources of error, we fix the std. deviation of the 3-dB couplers to 4.33%, based on reported wafer-scale analysis<sup>10,11</sup>. Then, for Case B, we model the phase shifts as Gaussian distributions with mean equal to the set value (as in Case A) while varying their std. deviation from 0 to 5%. As mentioned, this variability appears due to waveguide fabrication defects, the resolution of the subsystem driving the actuators and tuning-related crosstalk, to cite but a few. The remaining parameters are kept as in Case A. In this case, the resulting splitting ratio is shifted up to 1.59%. This translates into a degradation of 8.61 dB in the mean cross-port ratio, as illustrated in Fig. 3. Note that the reported values are relative to the mean values. An example of the variable distributions at the input and the output of the simulations is illustrated in Fig. 4 for Case B, where the std. deviation of the phase shifters is 5%. The output phase difference becomes randomly distributed, with more concentration at 0 and $\pm \pi$.

Finally, a third simulation using the Case B distributions, with phase-shifter means equal to 0 and $\pi/2$, respectively, shows that a phase std. deviation of 5% results in a std. deviation of 1% in the overall *K*.

In summary, design and fabrication effects result in state-of-the-art MZI-based TBUs with losses ranging from 0.2 to 0.5 dB and optical crosstalk in the range of -35 to -28 dB, which degrades to the -30 to -18 dB range when additional dynamic sources of error are included. The behavior of the TBU as a standalone component is essential to understand the sources of error in the next subsection, where accumulated losses and signal leaking through the mesh limit the scalability of the circuits.

### **4.2 Photonic circuit level**

The design and configuration of a waveguide mesh arrangement in which a large number of TBUs are replicated requires the consideration of several scaling factors: insertion loss versus temporal resolution, and optical crosstalk.

#### **4.2.1 Insertion loss vs temporal resolution**

The key functionalities of the waveguide mesh arrangement for optical signal processing synthesis are switching and the possibility of programming optical delay lines. These delays are intrinsically discrete and are based on the routing of a signal within the mesh through a configured path of TBUs. Thus, the delay is given as a discrete number of Basic Unit Delays (BUD). It follows that miniaturizing the TBU increases the temporal resolution of the mesh. If the selected TBU architecture and integration technology have sources of loss in addition to the inherent propagation loss, miniaturization comes at the cost of an increase in the loss per unit delay: for a fixed, large enough delay, the signal needs to go through more TBUs in a waveguide mesh with better resolution (miniaturized TBU).

To evaluate this issue, we assume two cases of fabrication and design quality in a silicon-on-insulator platform, with a group index of 4.18, that result in two levels of TBU loss. First, we set an optimum or high-quality PIC design and fabrication (HQ-PIC), described by a 3-dB coupler loss of 0.05 dB and an average propagation loss of 1.5 dB/cm. Second, we set a low-quality PIC design and fabrication (LQ-PIC), described by a 3-dB coupler loss of 0.2 dB and a propagation loss of 2.5 dB/cm. For simplicity we have neglected the additional losses of the phase shifters, although they would further reinforce the conclusions.

Next, we consider different TBU designs, parametrizing the Basic Unit Length for both design and fabrication qualities. As illustrated in Fig. 5(b), the shorter the TBU, the higher the resolution of the waveguide mesh, but the more sensitive it becomes to TBU losses due to design and fabrication issues. Logically, to set a programmed delay, the signal must go through more structures if the BUD is shorter. The mesh delay resolution directly impacts the discrete reconfigurability of the programmed interferometric structures. The Free-Spectral Range (FSR) is defined by the interferometric lengths<sup>3</sup>. Figure 5(a) illustrates the maximum FSR in a hexagonal waveguide mesh versus the basic unit delay given by the TBU design, for the synthesis of MZIs and optical cavities.
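The trade-off above can be reproduced with simple arithmetic. The sketch below uses the HQ-PIC and LQ-PIC loss figures and the group index of 4.18 quoted in the text. It assumes, for illustration only, that the TBU loss is the sum of two coupler losses plus the propagation loss over one Basic Unit Length (BUL), that a single hexagonal cell forms a ring of 6 BUDs, and that the BUL values and the 100 ps target delay are arbitrary examples; the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def basic_unit_delay(bul_m, group_index=4.18):
    """Basic Unit Delay (s) for a TBU of length bul_m (m)."""
    return group_index * bul_m / C

def tbu_loss_db(bul_m, coupler_loss_db, prop_loss_db_per_cm):
    """TBU insertion loss: two couplers plus propagation over one BUL."""
    return 2.0 * coupler_loss_db + prop_loss_db_per_cm * bul_m * 100.0

def delay_loss_db(target_delay_s, bul_m, coupler_loss_db, prop_loss_db_per_cm):
    """Number of TBUs and accumulated loss needed to reach a target delay."""
    n_tbu = math.ceil(target_delay_s / basic_unit_delay(bul_m))
    return n_tbu, n_tbu * tbu_loss_db(bul_m, coupler_loss_db, prop_loss_db_per_cm)

def fsr_hz(n_bud, bul_m):
    """Free-spectral range of a cavity or path imbalance spanning n_bud BUDs."""
    return 1.0 / (n_bud * basic_unit_delay(bul_m))

qualities = {"HQ-PIC": (0.05, 1.5), "LQ-PIC": (0.2, 2.5)}  # (dB/coupler, dB/cm)
target = 100e-12  # illustrative 100 ps delay line

for bul_um in (250, 500, 1000):
    bul = bul_um * 1e-6
    bud_ps = basic_unit_delay(bul) * 1e12
    ring_fsr_ghz = fsr_hz(6, bul) / 1e9  # single hexagonal cell as a ring
    for label, (cl, pl) in qualities.items():
        n, loss = delay_loss_db(target, bul, cl, pl)
        print(f"BUL={bul_um} um BUD={bud_ps:.2f} ps ring FSR={ring_fsr_ghz:.1f} GHz "
              f"{label}: {n} TBUs, {loss:.2f} dB")
```

Under these assumptions, halving the BUL roughly doubles the number of TBUs needed for the same delay, so the fixed coupler loss per TBU accumulates faster even though the propagation loss per TBU falls.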



Figure 5. Trade-off arising from TBU miniaturization: (a) Basic Unit Delay and maximum Free-Spectral Range in hexagonal arrangements, (b) Basic Unit Delay versus delay losses, (c) TBU insertion loss versus TBU lengths in SOI. MZI: Mach-Zehnder Interferometer, ORR: Optical Ring Resonator, LQ- HQ- Low and High quality Photonic Integrated Circuits.

#### **4.2.2 Crosstalk and internal reflections**

For the evaluation of the behavior of the overall circuit, we can use a scalable model that obtains the full scattering matrix, including the wavelength response, given the configuration and characteristics of every TBU in the circuit<sup>12</sup>.

Figure 6 illustrates the schematic and the application of the model to the synthesis of a complex 2D resonant structure (in this case a three-stage SCISSOR, each stage composed of a two-cavity CROW). Once the mesh is set to the optimum configuration, we obtain the transmission and reflection responses shown in Fig. 6(c). We can see the effect of switching off columns of CROWs by setting the corresponding coupling coefficients to $K = 0$. The method generates the corresponding $40 \times 40$ matrix of transfer functions, each one spanning 1000 wavelengths. The transfer functions of interest are $h_{33,13}$ and $h_{38,13}$, which characterize the transmitted and reflected signals, respectively, and can be recovered from the main system matrix by using an input vector given by $I=(i_1,i_2,\ldots,i_{40})$ where $i_k=0$ unless $k=13$ and $i_{13}=1$. The scalable method provides a fast and exact determination of the transfer functions even for this particularly involved structure, where both longitudinal and lateral coupling and recirculations are allowed.
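The recovery of individual transfer functions from the system matrix can be sketched as follows. The matrix `H` below is a random placeholder with the dimensions quoted in the text (1000 wavelengths, 40x40 ports). It does not come from the scalable mesh model of<sup>12</sup>, and the helper name `port_response` is hypothetical.

```python
import numpy as np

n_ports, n_wavelengths = 40, 1000
rng = np.random.default_rng(0)

# Placeholder system matrix: one 40x40 complex matrix per wavelength sample.
H = (rng.normal(size=(n_wavelengths, n_ports, n_ports))
     + 1j * rng.normal(size=(n_wavelengths, n_ports, n_ports)))

def port_response(H, in_port, out_port):
    """Transfer function h_{out,in} over wavelength, with 1-indexed ports."""
    i_vec = np.zeros(H.shape[-1], dtype=complex)
    i_vec[in_port - 1] = 1.0      # i_k = 0 except at the excited input port
    outputs = H @ i_vec           # shape: (n_wavelengths, n_ports)
    return outputs[:, out_port - 1]

h_33_13 = port_response(H, 13, 33)   # transmitted signal
h_38_13 = port_response(H, 13, 38)   # reflected signal
print(h_33_13.shape, h_38_13.shape)
```

Multiplying by the unit input vector simply selects column 13 of the matrix at every wavelength, so the two responses are the entries (33, 13) and (38, 13) of that column.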



Figure 6. (a) Waveguide mesh configuration for the implementation of a three-stage SCISSOR where each stage is composed of a second-order CROW. Note that both lateral and longitudinal recirculations are allowed in this structure. (b) Circuit layout. (c) Spectral response (moduli) for equal value of the resonator coupling constants $K=0.07$ and switching off the third and both third and second CROW units. (d) Drift performance using non-ideal components.

The effect of employing non-ideal components is illustrated in Fig. 6 (d), where the overall TBU coupling values K' follow a Gaussian distribution with mean equal to the ideal K and std. deviation equal to 1%. The overall common phase is modified as well, with a std. deviation of 4%. In this example we simulate a high-quality fabrication process with TBU losses of 0.2 dB. The analysis can be repeated by feeding the model directly with the distributions in Section 4.1, making it an essential tool for future waveguide mesh designs and for anticipating the limitations arising from fabrication and design defects.

However, it has been demonstrated that a smart configuration of the waveguide mesh can reduce the internal reflections and crosstalk issues, achieving improvements of more than 20 dB and making it possible to relax the TBU specifications<sup>12</sup>.

# **5. CONCLUSIONS**

We presented the system architecture and task description of a new class of multipurpose programmable photonic ICs. Through suitable programming and software definition, the combination of the electronic control layer and the photonic IC can be configured for different signal processing tasks. We focused our analysis on a feasibility study under realistic fabrication defects, highlighting the trade-offs and main limitations associated with their present and future scalability. Through a statistical model we predicted that current fabrication procedures provide the required performance in moderately large waveguide mesh arrangements. In addition, the cooperative work between the processing units is essential to relax the specifications during fabrication and overcome most of the future scalability issues.

# **ACKNOWLEDGEMENTS**

The authors acknowledge financial support by the ERC ADG-2016 UMWP-Chip, the Generalitat Valenciana PROMETEO 2017/017 research excellency award, and the COST Action CA16220 EUIMWP.

# **REFERENCES**

- [1] Soref, R. A., "Silicon photonics technology: past, present, and future," Proc. SPIE 5730, Optoelectronic Integration on Silicon II (7 March 2005); doi: 10.1117/12.585284.
- [2] Inniss, D., and Rubenstein, R., "Silicon Photonics: Fueling the next information revolution," Morgan Kaufmann, 1st edition (2016).
- [3] Pérez, D., Gasulla, I., Capmany, J., and Soref, R. A., "Reconfigurable lattice mesh designs for programmable photonic processors," Optics Express 24(11), 12093 (2016).
- [4] Zhuang, L., Roeloffzen, C. G. H., Hoekman, M., Boller, K.-J., and Lowery, A. J., "Programmable photonic signal processor chip for radiofrequency applications," Optica 2(10), 854 (2015).
- [5] Pérez, D., Gasulla, I., Crudgington, L., Thomson, D. J., Khokhar, A. Z., Li, K., Cao, W., Mashanovich, G. Z., and Capmany, J., "Multipurpose silicon photonics signal processor core," Nature Comm. 8, 636 (2017).
- [6] Pérez, D., Gasulla, I., and Capmany, J., "Field-programmable photonic arrays," Optics Express 26(21), 27265-27278 (2018).
- [7] Reck, M., Zeilinger, A., Bernstein, H. J., and Bertani, P., "Experimental realization of any discrete unitary operator," Phys. Rev. Lett. 73, 58-61 (1994).
- [8] Clements, W. R., Humphreys, P. C., Metcalf, B. J., Kolthammer, W. S., and Walmsley, I. A., "Optimal design for universal multiport interferometers," Optica 3, 1460-1465 (2016).
- [9] Miller, D. A. B., "Self-configuring universal linear optical component," Photonics Research 1, 1-15 (2013).
- [10] Mower, J., Harris, N. C., Steinbrecher, G. R., Lahini, Y., and Englund, D., "High-fidelity quantum state evolution in imperfect photonic integrated circuits," Physical Review A 92(3), 032322 (2015).
- [11] Mikkelsen, J. C., Sacher, W. D., and Poon, J. K. S., "Dimensional variation tolerant silicon-on-insulator directional couplers," Opt. Express 22, 3145-3150 (2014).
- [12] Pérez, D., and Capmany, J., "Scalable analysis for arbitrary photonic integrated waveguide meshes," Optica 6(1), 19-27 (2019).