2D materials–based homogeneous transistor-memory architecture for neuromorphic hardware

Memory and logic in the same device

Future artificial intelligence applications and data-intensive computations require the development of neuromorphic systems beyond traditional heterogeneous device architectures. Physical separation between a peripheral signal-processing unit and a memory-operating unit is one of the main bottlenecks of heterogeneous architectures, blocking further improvements in efficient resistance matching, energy consumption, and integration compatibility. Tong et al. present a transistor-memory architecture based on a homogeneous tungsten selenide-on-lithium niobate device array (see the Perspective by Rao and Tao). Analog peripheral signal preprocessing and nonvolatile memory were possible within the same device structure, promising diverse neuromorphic functionalities and offering potential improvements in neuromorphic systems on-chip. -YS

Homogeneous integration of 2D WSe2 (as a peripheral circuit) on LiNbO3 (as a memory array) can improve neuromorphic architectures.

In neuromorphic hardware, peripheral circuits and memories based on heterogeneous devices are generally physically separated. Thus, exploration of homogeneous devices for these components is key for improving module integration and resistance matching. Inspired by the ferroelectric proximity effect on two-dimensional (2D) materials, we present a tungsten diselenide–on–lithium niobate cascaded architecture as a basic device that functions as a nonlinear transistor, assisting the design of operational amplifiers for analog signal processing (ASP). This device also functions as a nonvolatile memory cell, achieving memory operating (MO) functionality. On the basis of this homogeneous architecture, we also investigated an ASP-MO integrated system for binary classification and the design of ternary content-addressable memory for potential use in neuromorphic hardware.

Recently proposed neuron-inspired hardware based on various emerging nanomaterials, especially two-dimensional (2D) materials, has effectively advanced neural networks (1-6). 2D materials can provide a platform to develop transistor architectures for memory operating (MO), including field-effect transistors (FETs), tunneling transistors, junction transistors, ferroelectric (FE) transistors, and ferromagnetic transistors, owing to their rich electrostatic control capabilities (1, 7). 2D materials-based FE FETs, FE bipolar junction transistors (BJTs), and FE tunneling transistors exhibit large on-off resistance ratios, fast operation, low power consumption, nonvolatile electronic control, and weight updates under reversible polarization (1) because of the strong proximal coupling of FE materials with 2D materials. Therefore, these 2D materials-based FE proximal coupled devices are being intensively investigated for neuromorphic computing (1, 4), in which they are used as memories by dynamically modulating the FE polarization to program the conductivity of superjacent 2D channels (8, 9).
Achieving computing tasks requires these memories to be integrated with peripheral circuits, because analog signal processing (ASP) is essential before and after MO (2). However, peripheral circuits are generally based on complementary metal-oxide semiconductor (CMOS) transistors, and thus the heterogeneous architectures between memory cells and peripheral circuits lead to their physical separation, making it necessary to consider module integration compatibility issues for chip design (2). In addition, an emerging challenge is how to achieve efficient resistance matching between heterogeneous device architectures as device dimensions scale down, which may hinder the pursuit of higher performance and energy efficiency (10). Therefore, it is crucial to explore the integration between ASP and MO.
Designing an ASP-MO integrated system with a homogeneous device architecture for peripheral circuits and memory cells offers the potential to relieve the above-mentioned issues, which can also be realized by the mechanism of 2D-FE proximal coupling. On the one hand, FE polarization proximity-induced nonvolatile electronic gating in 2D materials enables the design of nonlinear transistors, including p-n diodes and BJTs (8, 11-16). On the other hand, FE polarization can modulate the built-in potential in BJTs (17), enabling nonvolatile memory functionalities with an improved on-off resistance ratio. Moreover, the reconfigurable FE polarization domains can seamlessly manipulate arrayed doping domains in 2D materials, showing potential for the fabrication of massive cascaded devices with enhanced compactness. Therefore, a homogeneous, 2D material-based FE proximal coupled BJT architecture is proposed to design peripheral circuits for ASP, as well as nonvolatile memory cells for MO, enabling development of an ASP-MO integrated system.
In this work, seamlessly arrayed periodically polarized LiNbO3 (LNO) domains formed a grating-like structure (fig. S1), which effectively tailored the WSe2 channels into seamlessly arrayed junctions. ASP, MO, and their corresponding cascade were investigated to demonstrate the success of an integrated system based on the same device architecture, which included a WSe2 channel crossing three FE domains. Our operational amplifier (OPAMP) was designed for ASP (18), and memory cells with encoded synapse weights were cascaded with the OPAMP to demonstrate the applicability in binary classification (19). Furthermore, ternary content-addressable memory (TCAM) with a two-transistor-two-resistor (2T2R) configuration was designed with the homogeneous transistor-memory architecture, yielding a ratio of 898.4 between the high-resistance state (HRS) and the low-resistance state (LRS). Such an integrated system architecture could provide a feasible approach to solve the heterogeneous issue and improve neuromorphic applications.
Few-layer WSe2 flakes were exfoliated and transferred onto LNO to demonstrate the reconfigurable electronic functionalities under FE proximal coupling (WSe2 characterization and optical images are shown in fig. S1). The basic device functioned as a nonlinear transistor when the domain polarization state was fixed under zero gate voltage ($V_g$) (Fig. 1A), as the FE proximal coupling induces carrier doping in the WSe2 (Fig. 1B). This doping mechanism was indicated by the Kelvin probe force microscopy (KPFM) mapping shown in fig. S1C, where the higher (or lower) surface potential induced by the polarization-down ($P_d$) [or polarization-up ($P_u$)] domain was in accordance with the p-doping (or n-doping) nature (8). Transfer curves for the intrinsic WSe2 and FE-doped WSe2 FETs are shown in fig. S1L. The neutral point shifts were 3.8 and −6.2 V for the $P_d$ and $P_u$ domains, respectively, corresponding to a hole doping density of $2.07 \times 10^{12}$ cm$^{-2}$ and an electron doping density of $3.37 \times 10^{12}$ cm$^{-2}$. This doping character induced a built-in potential of ~0.43 eV and a depletion width ($w$) of 48.25 nm (fig. S1M) (20). For a basic n-p-n BJT on the $P_u$-$P_d$-$P_u$ domains, current amplifications were measured under the base, collector, and emitter voltage control. The common-base configuration in Fig. 1E showed an average gain of $\alpha = I_c/I_e = 0.979$ ($I_c$, collector current; $I_e$, emitter current) for the active region, and the common-emitter configuration in Fig. 1F showed a maximum gain of $\beta = I_c/I_b = 11.2$ ($I_b$, base current), offering the capability to design analog circuits. More details about signal rectification, amplification, and performance uniformity are shown in figs. S2 to S5.
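As a rough consistency check on the reported doping densities, the sketch below assumes the common parallel-plate gating relation $n = C\,|\Delta V|/e$, which the text does not state explicitly. Under that assumption, both reported pairs of neutral-point shift and doping density should imply the same areal capacitance $C$. The function name and the relation itself are illustrative assumptions, not the authors' stated procedure.

```python
# Elementary charge in coulombs (exact SI value).
E_CHARGE = 1.602176634e-19


def implied_capacitance_nF_per_cm2(doping_cm2, shift_V):
    """Areal capacitance implied by n = C * |dV| / e.

    Assumption: simple parallel-plate gating; not stated in the text.
    doping_cm2 is a carrier density in cm^-2, shift_V a voltage shift in V.
    """
    return doping_cm2 * E_CHARGE / abs(shift_V) * 1e9  # F/cm^2 -> nF/cm^2


# Values reported in the text for the P_d (hole) and P_u (electron) domains.
c_pd = implied_capacitance_nF_per_cm2(2.07e12, 3.8)
c_pu = implied_capacitance_nF_per_cm2(3.37e12, -6.2)

print(f"P_d domain: {c_pd:.1f} nF/cm^2")
print(f"P_u domain: {c_pu:.1f} nF/cm^2")
print(f"relative difference: {abs(c_pd - c_pu) / c_pd:.2%}")
```

The two implied capacitances agree to within about 1%, which suggests that both densities were derived from the neutral-point shifts with a single capacitance value.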
The basic device could also operate as a nonvolatile memory (Fig. 1C), the mechanism of which differs from those of conventional FE-FET and MemFlash (1, 4, 21). The FE polarization state was fixed to be $P_u$ for the collector and emitter, and the HRS and LRS were dominated by the built-in potential (17), which was controlled by $V_g$ for the base. In the potentiation (or depression) process, a positive $V_g$ from 6 to 9 V (or a negative $V_g$ from −6 to −9 V) changed the FE polarization state for the base to $P_u$ (or $P_d$), resulting in a low built-in potential with enhanced channel conductance (or a high built-in potential with reduced channel conductance) (Fig. 1D) (21, 22). The mechanism was indicated by the modulated current rectifications under the collector-emitter voltage ($V_{ce}$) sweep for the potentiation (Fig. 1G) and depression (Fig. 1H) processes [base-emitter voltage ($V_{be}$) = 0 V; $V_g$ step of 0.5 V]. This rectification behavior can provide a high resistance ratio between the HRS and LRS. In contrast to the conventional FE-FET structure with the WSe2 channel placed on a single FE domain (fig. S6), the resistance for the FE polarization states of $P_u$-$P_d$-$P_u$ and $P_u$-$P_u$-$P_u$ showed an average resistance ratio of ~$10^6$ under driving voltages of $V_{ce}$ = −3 V, $V_{be}$ = 0 V, and $V_g$ = 0 V (Fig. 1I).

Fig. 1 (caption fragment). Common-base (E) and common-emitter (F) configuration. (G) Current rectification for the potentiation process, for which $V_g$ ranged from 6 to 9 V with a step of 0.5 V. (H) Current rectification for the depression process, for which $V_g$ ranged from −6 to −9 V with a step of 0.5 V. (I) Channel resistance for the FE polarization state of $P_u$-$P_d$-$P_u$ and $P_u$-$P_u$-$P_u$ under the driving voltage of $V_{ce}$ = −3 V and $V_{be}$ = 0 V. The inset shows a schematic of the measured current. (J) Conductance update for the potentiation and depression with a $V_g$ step of 0.1 V and a width of 1 s. The drive voltages were $V_{ce}$ = 1 V and $V_{be}$ = 0 V.
Moreover, as an artificial synapse, the conductance update for the potentiation or depression process with a $V_g$ step of 0.1 V under $V_{ce}$ = 1 V and $V_{be}$ = 0 V (Fig. 1J) was encoded as a synapse weight (fig. S7A), and the on-off resistance ratio was ~$10^3$. The stable memory performance is shown in fig. S7 for measurements of multiple cycles, and the memory performances were compared with those from other studies of diverse memory architectures ([…]).

[…] film grown by chemical vapor deposition was then transferred and patterned on LNO to fabricate an ASP-MO integrated system. The OPAMP was designed by cascading multiple basic devices for ASP (Fig. 2, A to C) (18), with the input electrodes designated X1 and X2. The dc driving voltages were collector voltage ($V_{cc}$) = emitter voltage ($V_{ee}$) = 6 V, the resistances were $R_1 = R_2$ = 10 megohms, the capacitance ($C$) was 1 μF, and the input bias current was varied between 30 and 50 nA to adjust the performance. The OPAMP characteristics were discussed and compared with those of previously reported CMOS-based devices (table S2). We applied an ac input signal $V_i = 0.2\sin(200\pi t)$ ($t$, time) (Fig. 2D) at only one input port, and the other port was grounded. The output signal $V_o^-$ was inverted when the signal was input at port X1 (Fig. 2E), and the output $V_o^+$ was noninverted when the signal was input at port X2 (Fig. 2F) […] voltage gain was related to the ratio $R_f/R_3$. For the addition operation, port X2 was grounded, and both input signals were cascaded to port X1, with a resistance ($R_f$) for negative feedback. The two ac input signals $V_i^1 = 0.2\sin(200\pi t)$ and $V_i^2 = 0.3\sin(200\pi t)$ and the output signal $V_o$ are shown in Fig. 2G. Under the condition of $R_f = R_4 = R_5$ = 2 megohms, the output signal was $-V_o = V_i^1 + V_i^2$. When a capacitance was used for negative feedback, the OPAMP achieved the integral calculation with a time constant of $\tau = R_6 C = 2$ s ($R_6$ = 2 megohms, $C$ = 1 μF).
The square input signal $V_i$ and the corresponding integrated output signal $V_o = -\frac{V_i}{RC}\,t$ are shown in Fig. 2H. The triangle-shaped input signal and the approximately sine wave-shaped integrated output signal are shown in Fig. 2I.
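The addition and integration operations described above follow the textbook ideal op-amp relations. The sketch below is a minimal numerical illustration under stated assumptions, not a model of the WSe2-on-LNO circuit: the amplifier is treated as ideal, input amplitudes are taken to be in volts, $R_4$ and $R_5$ are assumed to be the input resistors of the two summed signals, and $C$ is taken as 1 μF, the value consistent with the stated time constant of 2 s for a 2-megohm resistor. The function names are hypothetical.

```python
import math


def inverting_adder(v1, v2, r_f=2e6, r4=2e6, r5=2e6):
    """Ideal inverting summing amplifier: V_o = -(R_f/R4 * V1 + R_f/R5 * V2)."""
    return -(r_f / r4 * v1 + r_f / r5 * v2)


def integrate(v_in, dt, r=2e6, c=1e-6):
    """Ideal inverting integrator: V_o(t) = -(1/RC) * integral of V_i dt.

    Uses a simple rectangle rule; v_in is a list of samples spaced dt apart.
    """
    tau = r * c  # time constant in seconds
    out, acc = [], 0.0
    for v in v_in:
        acc += v * dt
        out.append(-acc / tau)
    return out


# Addition: evaluate both inputs at the quarter period of the 100 Hz sine,
# where sin(200*pi*t) = 1, so the inputs are 0.2 and 0.3.
t = 0.0025
v1 = 0.2 * math.sin(200 * math.pi * t)
v2 = 0.3 * math.sin(200 * math.pi * t)
print("adder output:", inverting_adder(v1, v2))

# Integration: a constant 1 V input (one flat segment of a square wave)
# held for 1 s gives a linear ramp with slope -1/(RC) = -0.5 V/s.
dt = 1e-3
ramp = integrate([1.0] * 1000, dt)
print("integrator output after 1 s:", ramp[-1])
```

With equal resistors the adder reduces to $-V_o = V_i^1 + V_i^2$, and each flat segment of a square input produces a linear ramp, which is why a square wave integrates to a triangle-like output.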
On the basis of the reconfigurable functionalities, the ASP-MO integrated system was used to demonstrate the proof of principle for binary classification (23, 24). The input three pixel-by-three pixel patterns for the letters H, U, S, and T were labeled as class 1, and those for the numbers 2, 0, and 1 were labeled as class 0 (Fig. 3A shows eight randomly selected patterns; the entire training dataset is shown in fig. S8). The calculation schematic and circuit diagram are shown in Fig. 3, B and C, respectively. The pixels were coded into nine input voltages ($V_{ce}$) to calculate the weighted average current (Eq. 1) in nine memory cells. The weighted average current was converted into a voltage Score(V) in the transimpedance amplifier (TIA) (19, 25), with a resistance ($R_0$) of 10 megohms. The nonlinear output voltage (Eq. 2) was processed by the voltage comparator (VC) with a reference voltage ($V_{ref}$) of 1.5 V, which was similar to the sigmoid function (Fig. 3D). The drive voltages were $V_{cc}$ (6 V) and $V_{ee}$ (0 V), and the conductance was updated after calculating the cross-entropy cost for each batch, as summarized in fig. S9, C and D. The system was trained for 30 epochs, after which the pattern label and predicted classes in the simulations and experiments were elucidated (Fig. 3, E to G, respectively). Score(V) is summarized in fig. S9, E and F; the average Score(V) values of classes 0 and 1, which are separated by $V_{ref}$, are depicted in Fig. 3H. The accuracy and cost are shown in Fig. 3, I and J. The performance was mainly limited by the output characteristics of the VC and the non-negative weight in the hardware. Analogous to conventional FE memristors, this ASP-MO integrated system could be useful for other neuromorphic algorithms, although such investigation is beyond the scope of this work.
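The signal chain described above (pixel voltages, memory-cell conductances, TIA, comparator) can be sketched in a few lines. Eq. 1 and Eq. 2 are not reproduced in this text, so the sketch rests on several assumptions: the TIA output magnitude is taken as $R_0 \sum_i G_i V_i$, the comparator is modeled as a hard threshold at $V_{ref}$, and training is a single cross-entropy gradient step with weights clipped at zero to mimic the non-negative hardware weights. The 3 × 3 patterns, conductance values, read voltage, and learning rate are hypothetical and chosen only to make the example concrete.

```python
import math

R0 = 10e6    # TIA resistance from the text (10 megohms)
V_REF = 1.5  # comparator reference voltage from the text (V)


def score(pixels, conductances, v_on=1.0):
    """Score(V) = R0 * sum(G_i * V_i); the summed form is an assumption."""
    return R0 * sum(g * v_on * p for g, p in zip(conductances, pixels))


def classify(pixels, conductances):
    """Comparator step: class 1 if Score(V) exceeds V_REF, else class 0."""
    return 1 if score(pixels, conductances) > V_REF else 0


def update(pixels, conductances, label, lr=5e-9):
    """One cross-entropy gradient step; weights are clipped to stay >= 0.

    Constant factors of the gradient are folded into the learning rate lr.
    """
    p = 1.0 / (1.0 + math.exp(-(score(pixels, conductances) - V_REF)))
    return [max(0.0, g - lr * (p - label) * x)
            for g, x in zip(conductances, pixels)]


# Hypothetical row-major 3x3 patterns and conductances in siemens.
H = [1, 0, 1, 1, 1, 1, 1, 0, 1]
ZERO = [1, 1, 1, 1, 0, 1, 1, 1, 1]
g = [5e-9] * 9
g[4] = 150e-9  # large weight on the centre pixel, which separates H from 0

print(classify(H, g), classify(ZERO, g))  # prints "1 0"

g_new = update(ZERO, g, label=0)
print(score(ZERO, g), "->", score(ZERO, g_new))
```

Because the weights cannot go negative, a pattern can only be pushed below $V_{ref}$ by shrinking the conductances of its active pixels, which is one way to read the stated limitation from non-negative weights.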
TCAM is promising for the parallel search of massive datasets in in-memory computing (26, 27), and TCAM cells with a 2T2R configuration can be constructed on the basis of the homogeneous transistor-memory architecture (schematics of the mechanism are shown in Fig. 4, A to F). In our design, the TCAM cell included three WSe2 channels on nine separated FE domains, with four terminals marked as the emitter electrode ($X_e$), read electrode ($X_r$), and base electrodes ($X_{b1}$ and $X_{b2}$). Three central domains, designated D1 to D3, formed two memories, with memory R1 across D1 and D2 and memory R2 across D2 and D3. The polarization state of D2 was fixed at $P_u$; thus, R1 (or R2) was in an HRS when a $V_g$ of −9 V changed D1 (or D3) to $P_d$ and was in an LRS when a $V_g$ of 9 V changed D1 (or D3) to $P_u$. Moreover, R1 and R2 were connected with two BJT switches, which were switched on (or off) at a base voltage of […] (Fig. 4, B, D, and F). $X_e$ was grounded, and $V_r$ (5 V) was loaded at $X_r$ for the address search function.
The bit data 1 (or 0) in the TCAM cell was defined when R1 was in the HRS and R2 was in the LRS (or R1 in the LRS and R2 in the HRS). The matched state exhibited a low conductance for the address read at $X_r$; thus, bit data 1 was searched by switching on BJT1 and switching off BJT2 (Fig. 4, A and B). Bit data 0 was searched by switching off BJT1 and switching on BJT2 (Fig. 4, C and D). Otherwise, the TCAM cell was in the mismatched state with a high conductance. For bit data 1, the average conductance was ~0.5 nS (where 1 S = 1 A/V) for the matched state and 449.5 nS for the mismatched state, with an average conductance ratio $r_c = C_{high}/C_{low} = 899$ (Fig. 4H). The TCAM cell with an unknown bit (designated bit X) was always in the matched state, because both R1 and R2 were in the HRS (Fig. 4, E and F), with an average conductance of ~0.4 nS (Fig. 4I). The cumulative probability was analyzed in fig. S10.
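The match logic of the 2T2R cell can be summarized as a small truth table. The sketch below is a behavioral model under simplifying assumptions: the BJT switches are ideal (an off branch carries no current and an on switch adds no resistance), and the conductance read at $X_r$ equals that of the selected memory, using the averages reported for bit data 1 (about 0.5 nS matched, 449.5 nS mismatched). The names `STORED`, `read_conductance`, and `is_match` are hypothetical.

```python
G_HRS = 0.5e-9    # S, reported average for the matched (low-conductance) state
G_LRS = 449.5e-9  # S, reported average for the mismatched state

# Stored bit -> (state of R1, state of R2), as defined in the text.
STORED = {
    "1": ("HRS", "LRS"),
    "0": ("LRS", "HRS"),
    "X": ("HRS", "HRS"),  # unknown bit: both memories in the HRS
}


def read_conductance(stored_bit, search_bit):
    """Conductance seen at X_r for a given stored bit and search bit.

    Searching 1 switches BJT1 on (selects R1) and BJT2 off;
    searching 0 switches BJT2 on (selects R2) and BJT1 off.
    """
    r1, r2 = STORED[stored_bit]
    selected = r1 if search_bit == "1" else r2
    return G_HRS if selected == "HRS" else G_LRS


def is_match(stored_bit, search_bit):
    """A match is signalled by the low-conductance read."""
    return read_conductance(stored_bit, search_bit) == G_HRS


for stored in ("1", "0", "X"):
    for search in ("1", "0"):
        state = "match" if is_match(stored, search) else "mismatch"
        print(f"stored {stored}, search {search}: {state}")

print("conductance ratio:", G_LRS / G_HRS)
```

In this model a stored X matches either search bit because whichever branch is selected is in the HRS, in line with the always-matched behavior described for bit X.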
In addition to binary classification and TCAM design, the basic device should be suitable for various applications, including digital computing (fig. S11) and artificial neural systems with optical sensing abilities (fig. S12) (28,29). The basic device should also be applicable for the design of analog-to-digital converters, digital-to-analog converters, and analog filters (2). The device size can be reduced by scaling down the polarization domain size, which will improve the current gain and integration density. A lower coercive voltage would be helpful for achieving a lower power consumption, which can be attained by reducing the LNO thickness or using other FE materials. Finally, inspired by conventional CMOS-based chips, we have proposed a neuromorphic ASP-MO 3D stacking system derived from 2D integration (fig. S13). The main challenge lies in the growth of wafer-scale, high-quality 2D materials, and recent works have achieved promising breakthroughs to overcome this challenge (30). Thus, this homogeneous transistor-memory architecture will help to promote analogous neuromorphic systems on-chip.