Combining automated microfluidic experimentation with machine learning for efficient polymerization design

Understanding polymerization reactions is challenging because of the complexity of the systems, the hazards associated with the reagents, the environmental footprint of the operations and the highly nonlinear topologies of reaction spaces. In this work, we present a new methodology for studying polymerization reactions using machine-learning-assisted automated microchemical reactors. A custom-designed, rapidly prototyped microreactor is used in conjunction with automation and in situ infrared thermography for efficient, high-speed experimentation to map the reaction space of a zirconocene polymerization catalyst and obtain fundamental kinetic parameters. Chemical waste is decreased by two orders of magnitude and catalytic discovery is reduced from weeks to hours. Bayesian regularization backpropagation is used in conjunction with kinetic modelling to understand the reaction space and the resultant technoeconomic topology. Here, we show that efficient microfluidic technology can be coupled with machine-learning algorithms to obtain high-fidelity datasets on a complex chemical reaction.

Finding the best ratio of ingredients for polymerization reactions can be time consuming and wasteful. An automated microreactor process with integrated machine-learning analysis initiates reactions, measures the resulting yield and cleans itself without human intervention. It can systematically test reagent concentrations to find the combination with the highest production while generating little waste.

first-principles research problems, including applications in pharmaceuticals, fine chemicals and petrochemicals [28][29][30][31][32]. These recent developments were driven by a desire to perform experiments more quickly, maintain better control over the reaction environment and reduce waste in chemical research. There have been a number of studies in recent years integrating spectroscopic techniques 33. Overall, analysis and implementation of microsystems has emerged in pharmaceuticals and fine chemicals through robust understanding and reaction design 34,35. The rapid understanding and reaction design of metallocene catalysts has potentially large implications for the environmental footprint of polymer manufacturing. This acceleration in designing catalytic reactions can be accomplished using spectroscopic microreactors with machine intelligence.
A recently discovered catalyst of interest to academia and industry is (SBI)ZrMe2 (I) in conjunction with a B(C6F5)3 (II) activator. This combination is active with a broad range of α-olefins, creates polymers with desirable properties and reduces the reliance on dangerous activators like trimethylaluminium 22,[36][37][38][39][40]. Recently, several papers have been published on this and related catalyst systems aimed at understanding the kinetics and design of the reactions, with the general reaction mechanism shown in Fig. 1 (refs. 19,22,37,41,42). Our findings expand on these works through a semi-automated study of the catalyst's behaviour to quickly estimate the reaction space topology along with technoeconomic operating parameters. Here, traditional data analysis and visualization are supplemented by artificial neural networks (ANNs) used as a nonlinear fitting tool to model and predict catalytic behaviour without full knowledge of the underlying model's dependencies and degrees.
This work seeks to address two main challenges: (1) the design of flow-based microsystems for the quick and efficient screening of catalysts for exothermic chemical processes and (2) the extraction of the most information possible out of a given set of experiments. The first challenge is addressed through the creation of an integrated continuous-flow microfluidic platform that incorporates pumps, manifolds, controls and analytics into a single interface that is amenable to automation and integration with analytical techniques. Owing to heat and mass transport considerations, flow-based microreactors are challenging to use for such polymerization reactions, necessitating careful design and selection of operating conditions to ensure high data fidelity with minimal transport limitations and safe operation. The second challenge is addressed through the use of rational experimental design to sample an entire experimental space quickly and efficiently while extracting information of interest from a non-invasive, real-time thermal camera. Neural networks were used as a fitting tool to supplement the analysis by modelling and visualizing the behaviour of the experimental space. Overall, the system and process presented address current trends spanning chemical engineering and computer science by integrating microscale reactions with automated experimentation and machine-intelligence-enhanced process understanding.

Microreactor design
Microfluidic platforms are uniquely suited to the study of olefin polymerizations, as these exothermic reactions take place very quickly, have multiple reaction pathways, require precise control, generate large amounts of chemical waste and use expensive and difficult-to-synthesize catalysts. Microfluidics have already been successfully applied to the research of various polymerizations and other exothermic reactions [43][44][45][46].
The first step in our design of an intelligent microsystem is an order-of-magnitude estimation of heat and mass transport properties and dimensionless numbers, including the Damköhler number (the ratio of reaction rate to mass transport rate, Da = kC0^(n−1)τ, where k is the reaction rate constant, C0 is the initial concentration, n is the reaction order and τ is the residence time), the beta number (the ratio of heat generated to heat removed, β = rΔHrxn·dF^2/(κΔTad), where r is the reaction rate, ΔHrxn is the heat of reaction, dF is the diameter of the channel, ΔTad is the adiabatic temperature change and κ is the thermal conductivity) and the Reynolds number (the ratio of inertial forces to viscous forces, Re = ρVL/μ, where ρ is the density of the fluid, V is the velocity, L is the characteristic length and μ is the viscosity). Knowledge of these quantities enables the design of microfluidic devices that offer scalable chemical data, mimicking the physics found in industrial-scale processes and enabling visualization. Additionally, this provides context for analysing other properties of interest such as stream mixing, dispersion, heat transfer, mass transfer and the reaction kinetics. Here, the microreactor was designed such that the Damköhler number can be varied between 0.3 and 101, enabling the sampling of both reaction-rate-limited and mass-transport-limited regimes. The beta number varies between O(10^−3) and O(10) based on the standard heat of polymerization of 1-hexene (III), indicating that more heat is generated than removed, enabling thermographic analysis 47. Finally, the Reynolds number varies between O(10^−3) and O(1), meaning that the reaction operates in a laminar flow regime. The final design for the reactor is shown in Fig. 2a. Overall, the analysis and understanding of heat and mass transport characteristics are important to the efficient functioning of the reactor and accurate data collection.
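The three dimensionless groups above can be written as short helper functions. The fluid properties in the demonstration call are assumed illustrative values for a toluene-like solvent at room temperature, not the study's measured parameters:

```python
def damkohler(k, c0, n, tau):
    """Da = k * C0**(n-1) * tau: reaction rate relative to transport rate."""
    return k * c0 ** (n - 1) * tau

def beta_number(r, dH_rxn, d_f, kappa, dT_ad):
    """beta = r * dH_rxn * d_f**2 / (kappa * dT_ad): heat generated vs removed."""
    return r * dH_rxn * d_f ** 2 / (kappa * dT_ad)

def reynolds(rho, v, L, mu):
    """Re = rho * V * L / mu: inertial forces relative to viscous forces."""
    return rho * v * L / mu

# Illustrative (assumed) values: toluene-like density/viscosity, ~1 mm/s flow
# in a ~0.5 mm channel. The result lands deep in the laminar regime, O(1).
re_channel = reynolds(rho=862.0, v=1e-3, L=5e-4, mu=5.6e-4)
```

Evaluating all three groups at the extremes of the planned operating window is what bounds Da between 0.3 and 101 and confirms laminar flow before any chemistry is run.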

System characterization
The design of an intelligent microreactor involves characterization of the desired operation, both computationally using finite element analysis (FEA) and analytically with methods such as residence time distribution studies with chemical tracers. These analyses help with fully characterizing the system and ensuring that analytical methods will yield reliable results. FEA simulations were carried out, verifying a perceptible difference in infrared radiative flux between a warm reactor channel and the ambient reactor. The results of this simulation are presented in Fig. 2b, showing an ~30 W m^−2 difference in flux with a 20 K temperature gradient. Experiments were then carried out to verify the thermal performance of the microreactor. Reagent concentrations taken from previous literature were injected into the reactor and the exotherm was observed 22. Figure 2c,d presents thermographic images of the reactor, highlighting the development of reactive flow as the reagents mix. The catalyst flows through the far feed channel and the activator flows through the closer one. On contact, a change in radiation is observed, as indicated by the lighter blue colour. It is further observed that the reaction is initiated almost instantaneously when the reagents mix, with the strongest thermal signature in the first centimetre of the channel.
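The order of magnitude of the FEA figure can be sanity-checked with a grey-body Stefan-Boltzmann estimate. The emissivity used below is an assumed illustrative value for the fluoropolymer window, not a number reported in the study:

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def radiative_flux_difference(t_hot, t_cold, emissivity):
    """Difference in emitted grey-body flux between two temperatures, W m^-2."""
    return emissivity * SIGMA * (t_hot ** 4 - t_cold ** 4)

# 20 K gradient about room temperature (298 K -> 318 K); emissivity of the
# infrared-transparent window assembly is an assumption for illustration.
dq = radiative_flux_difference(318.0, 298.0, emissivity=0.23)
```

With these assumptions the estimate lands at roughly 30 W m^−2, the same order as the simulated flux difference, which is what makes the exotherm resolvable by the thermal camera.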
Finally, the microsystem design involves its integration with analytical methods, including both in situ and ex situ measurement techniques. Accurate and precise control, along with a diversity of measurable variables, is critical to creating an accurate digital-twin computer model of the complex polymerization. A challenge presented by a flow-based testing platform is the inclusion of calibrated pumps and manifolds not present in a traditional batch system. Other critical considerations are temperature mapping and homogeneity, which are uniquely difficult in typical flow reactors but are simplified in microsystems. The successful analysis, testing and integration of all these aspects results in a system that provides relevant data quickly and accurately. An overview of the experimental system is summarized in Fig. 3.

Algorithmic concept
The current work seeks to develop a methodology for the testing, design and general understanding of one class of zirconocene-based catalyst by using an ANN fitting. The algorithm, used as a form of supervised machine learning (ML), aids our understanding of the complex kinetics and reaction design for a homogeneously catalysed polymerization. Fitting traditional numerical models requires knowing the degree of dependence on the various independent variables (linear, polynomial, natural exponential and so on), whereas an ANN can be trained and adapted without complete knowledge of the system. It is, however, still necessary to have a rough understanding of the physical phenomenon to choose plausible activation functions and numbers of hidden-layer neurons. Recently, we demonstrated that ANNs can be used as a tool for modelling and understanding complex catalytic pathways for polymerization reactions from a first-principles in silico approach 48. A design of experiments (DoE) strategy employing a Latin hypercube design is used to accurately and randomly sample the multidimensional experimental space 49. Together, these steps compound the energy, cost and environmental savings of a smaller lab space, fewer trials and a fraction of the time and labour. This enables faster data collection and model development while reducing the time and amount of chemical waste generated. This methodology serves as a proof of concept for using DoE algorithms with ANN fitting and spectroscopic microfluidics to quickly gain process understanding.
The concept presented herein is that a semi-autonomous reaction system can perform as a machine-assisted chemist to help understand the complex reaction space for a homogeneous metallocene catalyst. The first component of this comprises the control and data interpretation systems that perform the experiments, gather the data and generate the fitted ANN models. An overview of the system used is provided in the top panel of Fig. 4. The process is controlled by a combination of MATLAB and LabVIEW, as each offers certain advantages. MATLAB allows for the development of advanced computational algorithms and includes a robust ML toolbox with different training algorithms. LabVIEW offers a real-time control environment with independent loops running to provide control of the system, including interfacing with external devices. Reagent mixing is provided by electromagnetic diaphragm pumps, flow through the reactor is established by the use of a pressure controller and thermal control is established through Peltier elements. A full description of the system is provided in the Supplementary Methods and page 3 of the Supplementary Information.
Data from the infrared camera were collected using the native camera software at a maximum speed of 24 frames per second and recorded across the reaction zone into a database. Reaction exotherm data were used to interpret the catalytic productivity, namely the grams of polymer produced per (mole catalyst × mole monomer × hour). Catalytic productivity is an important metric for polymerization catalyst design and is used when discovering new catalysts, understanding existing catalysts and designing industrial plants. Owing to the environmental concerns associated with catalyst production and reagent recycling, including greenhouse gas emissions and chemical waste, it is important to tune catalytic performance to meet ambitious goals and government regulations.
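The conversion from exotherm to productivity can be sketched as follows, assuming the heat released maps to moles of monomer polymerized through the heat of polymerization. The function name and every numerical value are illustrative placeholders, not the paper's calibrated energy balance:

```python
def productivity_from_exotherm(q_joules, dH_per_mol, monomer_molar_mass,
                               mol_catalyst, mol_monomer, hours):
    """Estimate catalytic productivity in g polymer / (mol cat * mol monomer * h).

    Moles polymerized are inferred from the integrated exotherm Q and the
    heat of polymerization dH_per_mol (J per mol of monomer enchained).
    """
    mol_polymerized = q_joules / dH_per_mol
    grams_polymer = mol_polymerized * monomer_molar_mass
    return grams_polymer / (mol_catalyst * mol_monomer * hours)

# Illustrative numbers only: a 1 kJ exotherm, a ~100 kJ/mol heat of
# polymerization, 1-hexene's molar mass (84.16 g/mol), and assumed loadings.
p = productivity_from_exotherm(q_joules=1000.0, dH_per_mol=1.0e5,
                               monomer_molar_mass=84.16,
                               mol_catalyst=1e-3, mol_monomer=1.0, hours=1.0)
```

In the actual platform, Q comes from integrating the thermographic frames over the reaction zone rather than being supplied directly.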
By interpolating the data from a minimal number of experiments using a quick and efficient fitting algorithm, it is possible to visualize the full range of the experimental topology, which could not be achieved using traditional trials.

Automated experimentation
Experimentation was conducted in such a way as to establish an understanding of the reaction space without the need for extensive trials. The reagents were manually prepared in an inert-environment glovebox and connected to the experimental manifolds. An automated routine was used to establish control, clean the reactor and perform experiments (see the flowchart presented in the bottom panel of Fig. 4). The purpose of the algorithm was to perform experiments efficiently in an automated fashion. The next experimental point was selected from a specified bank of monomer, catalyst and activator concentrations based on the recommendation of a Latin hypercube DoE algorithm. This allows a nearly random distribution to be sampled across the experimental space, enhancing the robustness of the resultant model. The experimental system consisted of the reagent mixing and storage equipment, thermal control, the reactor and the associated control systems (see the experimental flowchart in Fig. 3). The reactor and thermal camera were contained within a vacuum enclosure to reduce the effect of atmospheric interference, as the air and water vapour contained in it would introduce noise into the data. The reactor used in this study was fabricated by photopolymerization stereolithography and bonded to an infrared-transparent fluoropolymer film (for an overview of the reactor assembly and dimensions see Fig. 2a; for full information on fabrication see the Supplementary Information).
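The Latin hypercube selection step can be sketched with a minimal sampler: each variable's range is cut into as many strata as there are experiments, and each stratum is visited exactly once, giving near-random but space-filling coverage. The concentration bounds below are hypothetical, not the actual bank used in the study:

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Simple Latin hypercube sampler over box-shaped bounds.

    For each dimension, [lo, hi] is split into n_samples equal strata and
    each stratum receives exactly one sample, in shuffled order.
    """
    rng = random.Random(seed)
    dims = len(bounds)
    samples = [[0.0] * dims for _ in range(n_samples)]
    for d, (lo, hi) in enumerate(bounds):
        strata = list(range(n_samples))
        rng.shuffle(strata)
        for i, s in enumerate(strata):
            u = (s + rng.random()) / n_samples  # uniform within stratum s
            samples[i][d] = lo + u * (hi - lo)
    return samples

# Hypothetical bounds for [catalyst (mM), activator (mM), monomer (M)];
# 29 points mirrors the number of concentration trials in the study.
plan = latin_hypercube(29, [(1.0, 60.0), (1.0, 60.0), (0.1, 1.5)])
```

The automated routine would then dequeue one row of `plan` per experiment, rinse the reactor and run the next point without operator input.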
Overall, the process and system were designed to handle several challenges, including maintaining an inert environment, ensuring safety, performing experiments automatically and producing scalable data. The chosen catalyst and activator molecules are highly sensitive to moisture and oxygen, and must therefore be handled in such a way that they never come into contact with the atmosphere. Additionally, the solvent and monomer used present safety concerns, as both are toxic and highly flammable, producing vapours that may be explosive. The next challenge in the system design was the integration of robust process automation. The experiment needed to integrate a thermal camera, analytics and fluid handling seamlessly and autonomously. This was accomplished through the combined use of MATLAB and LabVIEW with the integration of an open-source Arduino microcontroller for manifold control. The final challenge in the system design was the microreactor itself. The reactor was designed from the 'bottom up' to ensure optimal fluidic and thermal performance, supported by both computational and experimental evidence. Overall, the system was designed to allow automated experimentation using sensitive chemistries with in situ analytics integrated with the microreactor platform.

Data collection and analysis
On verifying reactor performance, a Latin hypercube sampling algorithm was used to test the effects of the concentrations of monomer, activator and catalyst, with 29 initial experiments and eight additional trials performed at the random temperature setpoints of 16, 11, 8, 6, 42, 46, 61 and 79 °C. An overview of the experimental concentrations used for network training is shown in Fig. 5a, with all thermal trials using a low catalyst loading and a high concentration of activator and monomer. It was determined that decoupling the anisothermal (varying-temperature) trials from the anisotonic (varying-concentration) ones would yield the best quality of data because the experimental effects are not convoluted. If both sets of changing conditions were incorporated into the original Latin hypercube design, it would be difficult to separate which effects are caused by changing concentration and which by changing temperature, resulting in questionable statistical interpretations of the data. In total, these experiments generated (with reactor rinses) less than 30 g of chemical waste, a two-to-three-order-of-magnitude reduction from traditional experiments. The footprint of the experiment and all associated support equipment was contained within a nine-square-foot area in a fume hood.
The reactor performed as designed, generating a series of exotherms with both spatial and temporal resolution within the reactor. These exotherms were then used in conjunction with an energy balance for reacting laminar flow in a microchannel to glean information on both the catalytic productivity (a technoeconomic metric) and the reaction kinetics (necessary for reactor design and scaleup). Figure 5b shows how the catalytic productivity changes with changing reaction conditions. Finally, a series of ANNs was generated based on the experimental results in an effort to model the catalytic productivity throughout the experimental space. The results of this training, in the form of error percentage over the test dataset (15% of the experimental data), are provided in Supplementary Fig. 3, ranging over both training method and the number of hidden-layer neurons. Cross-validation was performed while taking care to keep the training and test data distinct. To test the fit quality of the chosen network architecture, training was repeated five times using random indices for training versus test data; detailed results are provided in Supplementary Table 2. A Bayesian regularization backpropagation training algorithm with a feedforward network consisting of an input layer, two hidden layers with nine neurons each and an output layer resulted in the best quality of fit over the experimental space; it was retrained after hyperparameter tuning using MATLAB's automated routines for selecting the training rate parameters and manual selection of hidden-layer/neuron numbers. The final network used for the study had mean square error (MSE) losses of 1.5 × 10^−2 and 6.0 × 10^−5 over the testing and training datasets, respectively. The Bayesian regularization backpropagation algorithm was expected to give the best representation because it combines Levenberg-Marquardt optimization with robust regularization, enhancing prediction stability. Neural networks were used as opposed to polynomial fits owing to their ease of adaptation to new experiments and input/output parameters. In the case presented here, the inputs consisted of concentration parameters and temperature, but the architecture of the network can be easily adapted to include other inputs, including flow parameters, chemical metrics such as Lewis acidity and adaptations for different reactants. The network can also be updated using reinforcement learning to adapt to different α-olefin polymerization reactions. The network was designed to output the catalytic activity, a technoeconomic parameter that quantifies how much polymer is made per unit of catalyst per unit of time. Bayesian regularization and normalization were used to provide a stable and scalable fit across a range of conditions. Raw data from the thermal camera were processed through ICI's proprietary libraries, which provided non-uniformity correction and autocalibration. For additional details on network training see the Supplementary Information.
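Bayesian regularization training (MATLAB's trainbr) minimizes a weighted sum of squared prediction errors and squared network weights. The smallest analogue of that objective is a one-parameter ridge fit, sketched below to show why the weight penalty stabilizes the fit; this is a conceptual illustration, not the paper's 2 × 9-neuron network:

```python
def ridge_slope(xs, ys, alpha):
    """Minimise sum((y - w*x)^2) + alpha * w^2 for a single weight w.

    This is the simplest instance of a regularized least-squares objective
    (sum of squared errors plus a squared-weight penalty), the same family
    of objective that Bayesian-regularization training balances adaptively.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)

xs = [0.5, 1.0, 1.5, 2.0]
ys = [1.0, 2.0, 3.0, 4.0]            # noiseless data from y = 2x
w_unreg = ridge_slope(xs, ys, 0.0)   # ordinary least squares recovers 2.0
w_reg = ridge_slope(xs, ys, 10.0)    # weight penalty shrinks the estimate
```

In Bayesian regularization the relative weighting of the two terms (here, `alpha`) is not fixed by hand but inferred from the data, which is what lets the trained network generalize from only a few dozen experiments.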

Kinetic and ANN analysis
The first step in the kinetic analysis was determining the catalytic productivity and kinetic rate constants for the various experiments. This was performed using automated MATLAB code, which extracted the exotherm from the experimental database and converted it into meaningful data through a heat balance. The raw experimental results are shown in Fig. 5b, where the experimental exotherms are compared with the neural network prediction over all trials. The average percent error of the fit is under 0.5%, with notable deviation observed in the low-temperature and first high-temperature trials. It is hypothesized that this deviation was caused by adsorption of gelled polymer product from the low-temperature experiment onto the reactor walls, which then desorbed in the first high-temperature trial. Following this deviation, the AI predictions once again lie in near-perfect agreement with the experimental results. Finally, Fig. 5c,d shows the kinetic rate constants of chain initiation and propagation, respectively. From these data it can be seen how the rate constants change with varying concentrations and temperature (plots 1-29 and 30-37, respectively).
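Once rate constants have been extracted at the eight temperature setpoints, their temperature dependence can be summarized by a linear least-squares fit of ln k against 1/T (an Arrhenius fit). The pre-exponential factor and activation energy below are synthetic illustrative values, not the fitted constants from the study:

```python
import math

def arrhenius_fit(temps_k, ks):
    """Least-squares fit of ln k = ln A - Ea/(R*T); returns (A, Ea in J/mol)."""
    R = 8.314  # gas constant, J mol^-1 K^-1
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(k) for k in ks]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    intercept = ybar - slope * xbar
    return math.exp(intercept), -slope * R

# Synthetic rate constants generated from assumed A = 1e6 s^-1, Ea = 50 kJ/mol
# over roughly the study's temperature window; the fit should recover both.
temps = [281.0, 295.0, 315.0, 335.0, 352.0]
ks = [1.0e6 * math.exp(-5.0e4 / (8.314 * t)) for t in temps]
A_fit, Ea_fit = arrhenius_fit(temps, ks)
```

With real (noisy) exotherm-derived constants, the same fit yields the apparent activation energies that feed the reactor-scaleup calculations.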
As the final component of the investigation, a neural network was trained and used to compute the catalytic productivity over a broad range of plausible points from the literature within the reaction space. This methodology presents a way to efficiently visualize the reaction space topology for complex catalytic cycles with a minimum number of experiments. The results of the ANN-based computation are shown in Fig. 6, where the first row shows the reaction space over a range of activator concentrations, the second row over monomer concentrations and the third row over catalyst concentrations. It is observed that the catalytic productivity decreases with increasing catalyst and monomer concentrations, consistent with the mathematical interpretation of the term. This is also consistent with the reaction mechanism, because as the concentration of monomer goes up, the prevalence of vinylene and vinylidene termination mechanisms may increase, and at high catalyst concentrations the activity per gram of catalyst is inherently lower owing to the inverse relationship. There is also some nonlinear behaviour at mid-range concentrations of monomer with high activator concentrations, perhaps due to competing branching and termination steps. It is also observed that the productivity tends to increase with increasing activator concentration, consistent with its purpose in the reaction mixture. This presents an interesting technoeconomic problem, which is investigated in the bottom half of Fig. 5, showing the catalytic productivity as a function of monomer concentration and temperature. These data represent a snapshot of 1,000 random concentration combinations over a range of temperatures, giving a randomized decision matrix from which an optimal value can be inferred given the desired parameters. Over the experimental space presented here, this value was determined to be a catalyst concentration of 54 mM, a nearly equimolar activator concentration of 56 mM, a monomer concentration of 1.04 M and an operating temperature between 40 and 78 °C. Overall, the ANN analysis allows for the generation of a very large dataset, which can then be used to make decisions about operating points as part of the catalyst discovery process.
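The randomized-decision-matrix step amounts to evaluating a trained surrogate model at many random points in the bounded concentration space and keeping the best. The toy quadratic below merely stands in for the trained ANN, and the unit-box bounds are placeholders for the real concentration ranges:

```python
import random

def random_decision_matrix(surrogate, bounds, n=1000, seed=0):
    """Evaluate a surrogate at n random points in a box; return the best pair."""
    rng = random.Random(seed)
    best_x, best_y = None, float("-inf")
    for _ in range(n):
        x = [lo + rng.random() * (hi - lo) for lo, hi in bounds]
        y = surrogate(x)
        if y > best_y:
            best_x, best_y = x, y
    return best_x, best_y

def toy_surrogate(x):
    """Stand-in for the trained ANN: productivity peaks mid-box (value 0)."""
    return -sum((xi - 0.5) ** 2 for xi in x)

# 1,000 random (catalyst, activator, monomer)-like combinations, as in the
# snapshot described above, but over an illustrative unit box.
x_opt, y_opt = random_decision_matrix(toy_surrogate, [(0.0, 1.0)] * 3)
```

Because the surrogate is cheap to evaluate, the same sweep can be repeated per temperature setpoint to build the full decision matrix from which the operating point is read off.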

Conclusions
Semi-autonomous microfluidic platforms integrated with in situ thermography may be used in conjunction with fitted, regularized and normalized neural networks to better understand catalytic cycles for homogeneous polymerization reactions. It has been shown that a metallocene-catalysed polymerization can be performed in a machine-assisted microreactor, and the results can be

quantified using an infrared camera, interpreted using an energy balance and used to train an ANN. The ANN may then, in turn, be used to visualize the reaction space at a higher resolution than would otherwise be possible using traditional experimentation. The behaviour of the rate constants of initiation and propagation was also investigated, along with the behaviour of the catalyst at varying temperatures. The reaction space for an industrially and academically relevant homogeneous metallocene polymerization system was investigated and its behaviour towards varying concentrations of reactants was visualized. The combination of these data provides a useful tool for both scientific and technoeconomic analysis of catalyst systems. It was also determined that the Bayesian regularization backpropagation algorithm with nine hidden-layer neurons, used in conjunction with normalization, provides the best quality of fit. It is hypothesized that this result is due to the fact that in Bayesian regularization the training algorithm determines a combination of squared errors and weights that generalizes well. This enables the construction of a network that is resilient across a wide range of conditions.
Overall, this study has demonstrated that high-throughput microfluidics can be aided by ML algorithms for the investigation of complex chemical reactions. This opens doors to new types of research, primarily the 'robotic chemist', increasing throughput, data fidelity and the efficiency of experimental campaigns. Future work could incorporate statistically based Monte Carlo design algorithms to aid in understanding the relationships between parameters. Depending on the use of the experimental results and knowledge of the underlying model, other numerical fitting techniques may also be used. Performing reactions at the microscale with automation reduces the amount of energy input and chemical waste generated, while also increasing safety because any failure in the reaction system is small and contained. Finally, the study aimed to contribute knowledge to online learning for complex systems, as the training methodology could be applied to other chemical, mechanical and electrical systems.
In the future, the methodology presented here may also be used to investigate other catalyst systems where a thermal signature may be expected, including other polymerizations, exothermic biochemical transformations and the catalytic breakdown of harmful emissions, including SOx and NOx compounds. This experimentation lays the groundwork for the ability to predict optimal catalytic performance conditions autonomously. Optimization of industrially relevant catalysts or catalyst/activator pairs may be accomplished by combining the neural network analysis with optimization techniques such as technoeconomic model minimization or genetic algorithms. Additionally, the high-resolution ANNs can be combined with various topological analysis techniques to design new experiments. This AI methodology and the design of machine-assisted microreactors reduce the amount of chemical waste and energy input while reducing the time to actionable data and enhancing the resolution of the reaction space topology.
Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Fig. 1 | Overview of the chemistry. Reaction mechanism for the polymerization of 1-hexene, showing the initiation, propagation and chain transfer steps, based on the work of Moscato and others 22. Here, ki, kp and kt are the rate constants of initiation, propagation and termination, respectively.

Fig. 2 | Reactor schematic and computational and experimental verification of performance. a, Computer-aided design rendering of the assembled reactor with thermal management and fluid delivery systems. The different components include the reactor, compression O-rings, fluid interface connections, Peltier cells for heating/cooling and a liquid cooling block to ensure stable thermal performance. b, FEA simulation of the reactor surface measuring the infrared irradiance (W m^−2) with a 20 K temperature gradient between the fluid and reactor. c, Image of the reactor channel with no flow, taken with the infrared camera. d, Image of the reactor with a fully developed reactive flow, taken with the same camera.

Fig. 4 | Flowcharts for experimental control and data handling. Top: diagram of the system architecture of the high-throughput experimentation system, with the functions of MATLAB and LabVIEW broken down by control loops and communication protocols. Bottom: flowchart of how experiments are performed, from reagent preparation to a report of the kinetic results. Analysis was performed after the experimental sequence was complete. CAN, controller area network; UI, user interface.

Fig. 5 | Experimental results for catalytic productivity and kinetic rate constants. a, Experimental concentrations of the various reagents over the 29 training trials. Red circles represent catalyst concentration, blue squares represent activator concentration and green diamonds represent monomer concentration. b, Catalytic productivity (g polymer/(mol(Zr) × mol(1-hexene) × h)) of the (SBI)ZrMe2 catalyst for the various experiments performed. Blue bars represent catalytic activity and red circles represent the AI prediction. The error bars represent the differences in analysed exotherm between frames in flow over the course of each experiment (standard deviation over the course of the experiment). Trials 9, 10, 13 and 34 were used for cross-validation (indicated by the 'T' markings in the figure) and different random combinations were used for additional testing, as shown in Supplementary Table 2. c, Natural logarithm of the kinetic rate constant for chain initiation as computed through a nonlinear fit over the various experiments. d, As in c, but for the chain propagation rate constant.