dbpRisk: Disinfection By-Product Risk Estimation

This work describes a new open-source software platform, the dbpRisk software, for conducting simulation experiments that model the formation of disinfection by-products in drinking water distribution networks under various conditions and uncertainties. The goal is to identify the risk level at each node location, thereby contributing to the enhancement of consumer safety. The use of the dbpRisk software is demonstrated using a real water distribution network model of the Nicosia water transport network.


Introduction
The goal of all water supply networks is to provide consumers with an adequate quantity of water without compromising its quality. From ancient to present times, water safety has been continuously developed [12]; however, despite technological progress, the distribution of healthy water remains a challenge for water suppliers worldwide. Water must be safe for human consumption and free from pathogenic micro-organisms, to prevent diseases and health problems [11]. Water pollution remains one of the leading causes of death and disease in the world. To mitigate these risks, water must be treated before use so that it is safe for human health [16]; this involves disinfection, which effectively kills the bacteria and other micro-organisms present in the bulk water.
Disinfection of drinking water can be considered the most important public health protection measure of the last century. The deactivation of bacteria through disinfection has contributed to reducing water-borne diseases. Disinfection can be achieved with chemicals, such as chlorine, chloramines, chlorine dioxide and ozone, that destroy pathogenic micro-organisms and produce safe drinking water. A side effect of the chemical disinfection process is the production of Disinfection By-Products (DBP), such as trihalomethanes, haloacetic acids, bromate and chlorite, which can have negative effects on human health.
For instance, disinfectants react with bromides and/or with the natural organic matter that exists in the water source, forming DBPs [22]. The formation of DBPs may also be due to anthropogenic contaminants which enter the drinking water sources and react with disinfectants [25]. Another parameter affecting DBP formation is the water age, e.g. due to stagnation: the longer the disinfected water remains unconsumed, the higher the concentration of DBPs [20]. Generally, the formation of DBPs depends on various chemical and environmental parameters, such as the pH, the temperature (which varies with the seasons), the injected chlorine dosage and the residuals throughout the network, the Total Organic Carbon (TOC), the source water quality (e.g. whether it originates from desalination or from a lake), the bromate concentration and the water age [15,25,19,8].
Water distribution networks are responsible for transporting clean water to consumers. These are typically large-scale systems comprised of reservoirs, junctions, pumps, valves and pipes that transport and deliver the disinfected water to the consumers via outflow nodes. From a hydraulics viewpoint, valves and pumps can be controlled automatically or manually to regulate pressures within the network [3]. From a water quality viewpoint, the goal is to ensure that a sufficient quantity of disinfected water is delivered to the consumers, and that a small disinfectant residual is present at each consumption node, in accordance with EU regulations [21].
Chlorine disinfection is used in most drinking water networks due to its low cost and its effectiveness in neutralizing dangerous micro-organisms under the safety conditions specified by the relevant agencies [21]. Chlorine is injected into the system at specific locations, in gas form (Cl2) or as hypochlorite salts (NaOCl). At chlorine residual concentrations of 0.03 to 0.06 mg/L, bacteria are deactivated within 20 minutes under normal conditions. Chlorine also reacts with natural organic matter and inorganic substances that exist in water [28]. This reaction is immediate and results in the creation of chlorination by-products, such as trihalomethanes [25].
Trihalomethanes (THM) constitute an important category of chlorination by-products, and their presence in drinking water indicates the possible existence of other chlorinated organic compounds at lower concentrations [23]. Trihalomethanes are a group of four chemical substances that are formed when chlorine reacts with organic or inorganic matter in water: chloroform, dibromochloromethane, bromodichloromethane and bromoform [6]. High THM concentrations affect health, causing liver, kidney and central nervous system problems, as well as an increased risk of cancer [14]. Depending on the regulatory body, the Total Trihalomethane concentration should be below 0.08 mg/L (US Environmental Protection Agency) or 0.1 mg/L (European Union).
DBPs have been studied extensively during the last 40 years in order to understand and predict their dynamics in drinking water [7]. For instance, studies have utilized linear regression models to describe the formation of DBPs [4,13]. Other research studies have proposed mechanistic or non-empirical kinetic models describing the formation of DBPs [8,1,17]. Furthermore, some studies have developed models considering the parameters affecting the THM dynamics, such as natural organic matter, initial chlorine dosage, temperature, pH, total organic carbon, UV254 and bromide [15,7,9]. A key issue is to balance the DBP risks associated with high chlorine concentration against the microbiological contamination risks associated with low chlorine concentration [25]. A substantial amount of research deals with different species of DBPs and their side effects on human health [25].
Water systems modelling is used extensively in research and has found applications in various commercial software packages. Modelling is mainly focused on the hydraulic dynamics, i.e. calculating the changes of flows and pressures within the distribution network based on estimated demands, whereas quality dynamics relate to the change in the concentrations of one or more chemical species. The open-source EPANET software is widely used in the academic community [24], and together with the Multi-Species eXtension library (EPANET-MSX), complex chemical reactions may be simulated, including their bulk-water and wall reactions [26].
Water quality modelling can be exploited for simulating and evaluating the DBP risk; for this, the EPANET(-MSX) modelling engine will be used. The contribution of this work is the design of a software platform, the dbpRisk software, which is able to conduct simulations for modelling DBP formation under various conditions and uncertainties, in order to assign risk levels and assist decision makers in making more informed decisions for enhancing consumer safety. In addition, the software can be used as a module of a water quality monitoring and control system which utilizes sensor measurements, simulation and control to optimize the operation of the system, maintaining a high level of water quality while reducing the DBP risk. In particular, the dbpRisk software interface is presented; it is based on Matlab and is released under the open-source European Union Public Licence (EUPL) at https://github.com/KIOS-Research/dbpRisk.
In Section 2, the problem formulation is presented, and in Section 3 the dbpRisk software is described. Section 4 presents a case-study on a real water distribution network and Section 5 concludes the paper.

Problem Formulation
This section presents the formulation of the problem of determining the high-risk areas in a water distribution network.

Quality Dynamics
In general, the overall chlorine decay dynamics in water flowing through a distribution system are described as follows [5]. The chlorine decay in the bulk water is typically described using a first-order kinetic model, such that [5,2]

$$\frac{dC}{dt} = -K_b C, \qquad K_b = a\, X_{TOC} \exp\left(-\frac{b}{T}\right),$$

where $K_b$ is the bulk decay constant, $C$ is the chlorine concentration, $T$ is the temperature in Kelvin, $X_{TOC}$ is the Total Organic Carbon (TOC) concentration in bulk water, and the constants are $a = 1.8 \times 10^{6}$ L/(mg h) and $b = 6050$ K. Some of the disinfectant also reacts with material at the pipe walls along which the water moves. Modelling this requires the pipe wall reaction rates as well as the mass transfer limitations of the disinfectant from the bulk liquid to the wall. The dynamics for the wall reaction rate within a pipe $p$ are as follows [5]:

$$K_w = \frac{4 K_{w1} K_F}{D_p \left(K_{w1} + K_F\right)},$$

where $K_F$ is the mass transfer coefficient, $K_{w1}$ is the wall reaction rate constant, $K_w$ is the overall wall decay constant and $D_p$ is the pipe diameter. The mass transfer coefficient $K_F$ will in general depend on the flow turbulence as well as on the diameter of the pipe; a typical empirical relation expresses it as a function of the Reynolds number $R_e$.
Finally, the first-order chlorine decay dynamics in pipes, neglecting spatial transport dynamics [5], are given by

$$\frac{dC}{dt} = -\left(K_b + K_w\right) C.$$

According to various studies, trihalomethane kinetics can be modelled as follows [18,27,10]:

$$\frac{dX_{THM}}{dt} = \beta K_b C,$$

where $X_{THM}$ is the THM concentration; the reaction coefficient $K_b$ and the THM yield coefficient $\beta$ were obtained from simulated distribution system data, so that the results agree as closely as possible with the real system.
Other related water quality parameters can be modelled, such as the water age ($X_{WA}$) and the Total Organic Carbon (TOC) concentration $X_{TOC}$ in the bulk water. Water age can be modelled using zero-order kinetics with a rate constant equal to 1; for example, each second the water becomes one second older [24]. TOC can be modelled following zero-order kinetics with a rate constant equal to 0, when it is assumed to remain constant [26]:

$$\frac{dX_{WA}}{dt} = 1, \qquad \frac{dX_{TOC}}{dt} = 0.$$
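As a minimal sketch of the kinetics described in this section, the following Python snippet evaluates a temperature- and TOC-dependent bulk decay constant and the closed-form solution of first-order chlorine decay with a proportional THM yield. The functional forms K_b = a X_TOC exp(-b/T), dC/dt = -(K_b + K_w) C and dX_THM/dt = β K_b C are assumptions made for illustration, chosen to match the symbols and units given in the text. The function names, the lumped wall decay constant and the yield value used in the example are hypothetical and are not taken from the dbpRisk code.

```python
import math

A_COEF = 1.8e6   # L/(mg h), constant a from the text
B_COEF = 6050.0  # K, constant b from the text


def bulk_decay_constant(toc_mg_l, temp_kelvin):
    """Assumed form K_b = a * X_TOC * exp(-b / T), returned in 1/h."""
    return A_COEF * toc_mg_l * math.exp(-B_COEF / temp_kelvin)


def chlorine_and_thm(c0, toc_mg_l, temp_kelvin, k_wall, beta, hours):
    """Closed-form solution of the assumed batch model
         dC/dt   = -(K_b + K_w) * C,   C(0) = c0
         dTHM/dt = beta * K_b * C,     THM(0) = 0
    Returns (C, THM) after `hours`. k_wall is a lumped wall decay
    constant in 1/h and beta a yield coefficient (both hypothetical inputs)."""
    k_b = bulk_decay_constant(toc_mg_l, temp_kelvin)
    k_tot = k_b + k_wall
    if k_tot == 0.0:
        # No decay at all: chlorine stays constant and no THM is formed.
        return c0, 0.0
    c = c0 * math.exp(-k_tot * hours)
    # THM formed is proportional to the chlorine consumed by bulk reactions.
    thm = beta * (k_b / k_tot) * (c0 - c)
    return c, thm


# Illustrative comparison of two chlorine doses over 9 days (216 h).
# k_wall = 0.01 1/h and beta = 40.0 are placeholder values, not calibrated ones.
for dose in (1.0, 0.5):
    c, thm = chlorine_and_thm(dose, 2.5, 293.15, 0.01, 40.0, 216.0)
    print(f"dose={dose} mg/L -> residual={c:.4f} mg/L, THM={thm:.3f}")
```

Under these assumptions THM formation scales linearly with the chlorine dose and increases with temperature through K_b, which is the qualitative behaviour examined in the case study.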

Propagation Dynamics
In general, the propagation and reaction dynamics in water distribution networks are described by a set of hyperbolic partial differential equations, which can be discretized using a numerical scheme to facilitate computational solutions. Following the formulation in (Eliades and Polycarpou, 2010), let $k$ be the discrete time with time step $\Delta t$, and let the substance propagation in a water distribution network segmented into $N_x$ finite volume elements be described by discrete-time state-space equations, in which $x(k)$ is the concentration of all substances in all finite volumes at time $k$. The state matrix $A(k; p_x)$ is time-varying and depends on the distribution network topology as well as on the hydraulic parameter set $p_x$, which affects water flows and includes consumer demands, node elevations, as well as pipe lengths, diameters and roughness coefficients. The function $R$ corresponds to the reaction dynamics of chlorine with organic substances to produce THM, and depends on the hydraulic parameter set $p_x$ and the parameter set $p_c$. The parameters $p_x$, $p_c$ are in general only partially or nominally known, and the uncertainty in the knowledge of these parameters may affect the final solutions. To alleviate this problem, we may construct a number of scenarios with the aim of capturing the variability in the real water distribution network. Let $P$ be the finite set of all the different hydraulic and quality parameter sets considered. Each parameter set corresponds to a scenario, and $P$ is comprised of $N_p$ scenarios. The intuition behind using different scenarios is to provide a more robust solution, which may differ from the solution computed if only average parameter values were considered.
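The scenario construction described above can be sketched as follows. This is only an illustration: it assumes that each uncertain parameter is sampled uniformly within a symmetric relative band around its nominal value, which is one possible sampling method among those a user could select. The function name and the parameter names in the example are hypothetical.

```python
import random


def build_scenarios(nominal, uncertainty, n_scenarios, seed=0):
    """Return a list of n_scenarios parameter dictionaries.

    nominal     : dict of parameter name -> nominal value
    uncertainty : dict of parameter name -> relative uncertainty (0.25 = 25%)
    Parameters missing from `uncertainty` are kept at their nominal value.
    Uniform sampling is an assumption made for this sketch."""
    rng = random.Random(seed)  # seeded so that scenario sets are reproducible
    scenarios = []
    for _ in range(n_scenarios):
        scenario = {}
        for name, value in nominal.items():
            frac = uncertainty.get(name, 0.0)
            bound_a = value * (1.0 - frac)
            bound_b = value * (1.0 + frac)
            low, high = min(bound_a, bound_b), max(bound_a, bound_b)
            scenario[name] = rng.uniform(low, high)
        scenarios.append(scenario)
    return scenarios


# Hypothetical parameter names; each dictionary plays the role of one element of P.
nominal = {"demand_multiplier": 1.0, "roughness": 100.0, "temperature_c": 20.0}
uncertainty = {"demand_multiplier": 0.25, "roughness": 0.25}
P = build_scenarios(nominal, uncertainty, n_scenarios=5)
for scenario in P:
    print(scenario)
```

Each resulting dictionary would then be passed to the hydraulic and quality simulators, so that the set of simulated outputs reflects the assumed parameter variability rather than a single nominal run.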
Let $N$ be the number of consumption nodes in the network; for a simulation time $k \in K$, the estimated chlorine and THM concentrations $\hat{Y}$ at each node are computed by $f_C$ and $f_T$, the quality dynamics simulators for chlorine and THM respectively; in practice this is achieved through the use of the EPANET and EPANET-MSX libraries.

DBP Risk Modelling
Ultimately, this problem relates to the question "Which areas in a large-scale water distribution network are at risk of a higher disinfection by-product concentration?". Let $L$ = {'blue/low', 'cyan/low-medium', 'green/medium', 'orange/medium-high', 'red/high'} be the set of impact risk colour and level labels; for instance, 'blue' corresponds to the lowest risk and 'red' to the highest risk. For $N$ network nodes, let $Z \in L^N$ be the disinfection by-product risk across all nodes, and let $f_L : \mathbb{R}^N \to L^N$ be a function that maps the average estimated THM concentration within time $K$ to an impact metric in $L$. For the $i$-th node, with concentrations expressed in µg/L, the following are considered: $\hat{Y}^i_{THM}(k) \le 30$ has low impact, $30 < \hat{Y}^i_{THM}(k) \le 60$ has low-medium impact, $60 < \hat{Y}^i_{THM}(k) \le 80$ has medium impact, $80 < \hat{Y}^i_{THM}(k) \le 100$ has medium-high impact and $\hat{Y}^i_{THM}(k) > 100$ has high impact.
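The mapping from THM concentration to risk label can be sketched as a simple threshold lookup. The snippet below uses the thresholds and labels listed above; the function names are hypothetical, and averaging each node's series over the simulation window before classification is an interpretation of the description rather than a statement about the dbpRisk implementation.

```python
# (upper bound, label) pairs in increasing order; values above the last
# bound fall into the highest risk class.
RISK_THRESHOLDS = [
    (30.0, "blue/low"),
    (60.0, "cyan/low-medium"),
    (80.0, "green/medium"),
    (100.0, "orange/medium-high"),
]
HIGHEST_RISK = "red/high"


def risk_level(thm):
    """Map a single THM concentration to its colour/level label."""
    for upper_bound, label in RISK_THRESHOLDS:
        if thm <= upper_bound:
            return label
    return HIGHEST_RISK


def node_risks(thm_series_per_node):
    """Average each node's THM time series and classify the average.

    thm_series_per_node: list with one non-empty list of THM values per node."""
    risks = []
    for series in thm_series_per_node:
        average = sum(series) / len(series)
        risks.append(risk_level(average))
    return risks


# Two hypothetical nodes with illustrative THM time series.
print(node_risks([[20.0, 30.0, 25.0], [90.0, 130.0, 110.0]]))
```

Keeping the thresholds in one ordered table makes it straightforward to adapt the classes to a different regulatory limit.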

dbpRisk Software
The dbpRisk software is designed with the goal of providing a flexible and user-friendly tool for the academic as well as the professional community, making it easy to evaluate the DBP risk. The software is built upon the EPANET-Matlab-Class, an open development platform which incorporates methods to assist the simulation and control of water distribution systems, utilizing Matlab's class structures and the dynamic software libraries of the widely used EPANET engine, as well as EPANET-MSX for simulating multi-species reactions [24,26]. This tool is comprised of a set of functions which are based on EPANET, along with other useful functions for visualization, simulation and data management.
The dbpRisk software has been designed with a modular architecture for expandability, which is depicted in Fig. 1. First, the Data Module extracts all the network parameters from the EPANET input file provided, which includes the network topology, pipe lengths and diameters, roughness coefficients, node elevations and demands, the characteristics of tanks, valves and pumps, as well as quality parameters. These parameters are stored within an EPANET object created in Matlab, which is used by the other modules. In addition, the water distribution network is plotted in the dbpRisk GUI (Fig. 2). The Setup Analysis Module allows the user to select the parameter bounds and the sampling method for constructing the scenarios which will be used in the simulation module. These scenarios are stored in the Scenarios file (0-file). The scenarios are then simulated in the Quality Analysis Module, which uses the EPANET-MSX library to first solve the hydraulic scenarios, corresponding to the network flows, and then the quality dynamics for each of them. Finally, the Results Module calculates the impact risks and depicts this information, along with other graphs and frequency diagrams, on the dbpRisk GUI.

Case Study
The operation of the dbpRisk software is demonstrated using a real model, the Nicosia transport network. This network is comprised of 395 junctions, 282 pipes, 2 tanks, 2 reservoirs and 122 valves.
In the following examples, unless otherwise stated, the following parameters are considered: the temperature is 30 °C with 5% uncertainty; the TOC concentration is $X_{TOC}$ = 1.5 mg/L in the main reservoir; the wall reaction coefficient is $K_w$ = 0.214 m/day; and the parameter used to model the individual THM formation is β = 33.5 mg/L. The network is loaded and the simulation parameters are specified, as in Fig. 3.

Fig. 3. The Nicosia water transport network and parameter setup.

Parameters affecting the THM
To examine the effect that individual parameters have on the THM concentration, we evaluate different scenarios with a simulation duration of 9 days.
Effect of chlorine dose: In the first scenario, chlorine is injected at 1 mg/L and the TOC concentration is $X_{TOC}$ = 2.5 mg/L at the reservoirs. The water temperature is 20 °C in each pipe and tank. A second scenario examines a chlorine injection concentration of 0.5 mg/L at the reservoirs. The results are depicted in Fig. 4, where it is observed, as expected, that the THM concentration is higher when the chlorine dose is 1 mg/L.

Effect of temperature: We introduce chlorine at 0.5 mg/L and TOC at $X_{TOC}$ = 2.5 mg/L at the reservoirs. We create two scenarios, one with water temperature 20 °C and one with water temperature 35 °C. The results are depicted in Fig. 6, where it is observed that the THM concentration is higher when the temperature is 35 °C. Another observation is that the variations of each species

Parameter uncertainty
This section investigates the effects of uncertainty with respect to the DBP risk. As in the previous section, the simulation duration is 9 days. By considering uncertainties, it is possible to evaluate the sensitivity of each parameter affected, and to calculate upper and lower bounds of each chemical parameter.
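As a rough illustration of how upper and lower bounds could be derived from a set of simulated scenarios, the sketch below takes one concentration time series per scenario for a given node and computes the per-time-step envelope. The function name and the numbers in the example are hypothetical; they are not outputs of the case study.

```python
def scenario_bounds(series_by_scenario):
    """Return (lower, upper) lists holding, for every time step, the
    minimum and maximum value observed across all scenarios.

    series_by_scenario: list of equally long lists, one per scenario."""
    if not series_by_scenario:
        raise ValueError("at least one scenario is required")
    length = len(series_by_scenario[0])
    if any(len(series) != length for series in series_by_scenario):
        raise ValueError("all scenarios must cover the same time steps")
    # zip(*...) regroups the data so that each tuple holds one time step.
    lower = [min(step) for step in zip(*series_by_scenario)]
    upper = [max(step) for step in zip(*series_by_scenario)]
    return lower, upper


# Hypothetical THM series for three scenarios at three time steps.
thm = [
    [10.0, 20.0, 30.0],
    [12.0, 18.0, 33.0],
    [9.0, 21.0, 29.0],
]
lower, upper = scenario_bounds(thm)
print("lower:", lower)
print("upper:", upper)
```

The width of the band between the two bounds gives a simple indication of how sensitive the simulated concentration is to the parameter that was varied.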
Effect of demand uncertainty: Two scenarios with 0% and 25% demand uncertainty are considered. The reservoirs have a TOC concentration of $X_{TOC}$ = 1.5 mg/L and 1 mg/L of chlorine, with water temperature 20 °C. Results are shown in Fig. 8.
Effect of roughness coefficient uncertainty: Two scenarios with 0% and 25% roughness coefficient uncertainty are considered. The reservoirs have a TOC concentration of $X_{TOC}$ = 1.5 mg/L and 1 mg/L of chlorine, with water temperature 20 °C. Results are shown in Fig. 9.

Conclusions
In this work a new open-source software platform, the dbpRisk software, was described. It can be used for conducting simulation experiments that model the formation of disinfection by-products in drinking water distribution networks under various conditions and uncertainties, with the goal of identifying the risk level at each node location and assisting decision makers in making more informed decisions for enhancing consumer safety. A case study was demonstrated using the Nicosia water transport network. Future work will consider more detailed water quality models of DBP formation and, in addition, incorporate epidemiological metrics in order to measure the risk in relation to the affected population.