Fixed bed adsorption of water and air contaminants: analysis of breakthrough curves using probability distribution functions

Abstract
Simple fixed bed models such as the Bohart–Adams and Thomas equations are often used to describe the breakthrough characteristics of water and air contaminant adsorption in fixed bed adsorbers. However, these popular models are confined to correlating highly symmetric breakthrough curves. The present study explores the feasibility of using two probability distributions (normal and log-normal) to track asymmetric breakthrough curves for the adsorption of four water and air contaminants (ciprofloxacin, ammonium, hydrogen chloride, and hydrogen sulfide) in fixed bed columns. The normal and log-normal probability distributions provided accurate fits to the slightly asymmetric ciprofloxacin breakthrough curve. They also provided a good representation of the overall shape of the ammonium breakthrough curve, but failed to describe the leakage of ammonium during the initial period of column operation. The log-normal distribution was able to match the asymmetric HCl and H₂S breakthrough curves. The normal distribution, by comparison, failed to describe these two asymmetric breakthrough curves. Because the log-normal distribution has a floating inflection point, it was very effective in describing the asymmetric HCl and H₂S breakthrough curves that were sharp initially and then broadened significantly as the column approached saturation. The ability of the normal distribution to track such asymmetric breakthrough data was impeded by its invariant inflection point.


Introduction
The modeling of pollutant adsorption in fixed bed adsorbers is of continuing interest to researchers working in the domain of water and air decontamination. In this field of research, the bulk of experimental results collected in the form of breakthrough data are invariably processed using mathematical models. Mechanism-driven models are useful for exploring the fundamental aspects of fixed bed adsorption. The parameters of such models are often calibrated using information obtained from independent sources and/or engineering correlations. For example, small-scale batch experiments are commonly used to measure equilibrium and mass transport parameters. These parameters can also be extracted from breakthrough data collected from "microcolumn" adsorbers (Weber and Liu 1980; Weber and Wang 1987). The parameters so obtained are plugged into an appropriate mechanistic model to predict the dynamic behavior of laboratory- or pilot-scale fixed bed adsorbers. A variety of such models, originally developed by chemical engineers (Ruthven 1984), have been tested in the environmental adsorption literature. Because these models are made up of coupled partial differential equations, specialized software and programming code are required for their numerical solution.
Besides mechanism-driven models, empirical or data-driven models are also used to describe contaminant breakthrough curves. The terms "big data" and "data science" are becoming pervasive, affecting many aspects of environmental remediation research (Hering 2019; Newhart et al. 2019). Large amounts of data are being used to develop optimization and process control strategies. In the context of fixed bed adsorption, "small data" sets are used to construct empirical models. Commonly used methods to build empirical models from breakthrough data include the design of experiments and neural network modeling approaches (Dalhat et al. 2021; Schio et al. 2021). These data-driven models may be used for process design or optimization, but they lack mechanistic merit and should not be used to predict breakthrough behavior outside the range of experimental conditions used for model calibration.
Another empirical modeling approach involves the use of simple phenomenological models such as the Bohart-Adams and Thomas equations (Bohart and Adams 1920; Thomas 1944). These two models consider adsorption equilibrium, but their rate mechanisms are based on simple reaction kinetics. Although unrealistic, the assumption of reaction kinetics as the rate-limiting step simplifies the mathematical treatment of fixed bed adsorption. Another simple model is the Yoon-Nelson equation (Yoon and Nelson 1984), which often appears together with the Bohart-Adams and Thomas equations in fixed bed modeling studies. The Yoon-Nelson equation is highly empirical since it is devoid of typical fixed bed variables such as flow rate and bed length. These simple analytical models permit modeling work to be conducted using standard spreadsheet programs.
A deficiency of the Bohart-Adams, Thomas, and Yoon-Nelson equations is that they cannot fit asymmetric breakthrough curves to a significant degree of precision. From a practical perspective, this is a serious limitation because a poorly fitted model is likely to predict grossly inaccurate estimates of breakthrough and exhaustion times. Limited efforts have been made to enhance the data fitting abilities of the three models, as summarized by Apiratikul and Chu (2021).
Given the inadequacy of the Bohart-Adams, Thomas, and Yoon-Nelson equations, there is a need to develop more effective models with simple mathematical forms. Probability distribution functions are a potential source of such models. This is because cumulative probability distribution functions can generate sigmoid or S-shaped curves, which are similar to the shapes of fixed bed breakthrough curves. Also, straightforward data analysis by spreadsheet calculation is possible because many probability distributions have simple mathematical forms. For example, the normal probability distribution (Dima et al. 2020; Huynh et al. 2021) and the Weibull probability distribution (Chu 2021) have been used to correlate water contaminant breakthrough curves.
Although the normal probability distribution is a relatively simple equation, its data fitting ability is somewhat limited, as this paper will demonstrate. The purpose of the present study is to highlight the logarithmic normal probability distribution as a more effective alternative to the normal probability distribution in the correlation of contaminant breakthrough data. It is somewhat surprising to note that the log-normal probability distribution, which is closely related to the normal probability distribution, has attracted hardly any attention. So far as the authors know, there are no reports of the application of the log-normal probability distribution to fixed bed adsorption of water and air contaminants. In this work, the relative performance of the normal and log-normal probability distributions will be illustrated using previously published breakthrough data for two water contaminants (ciprofloxacin and ammonium) and two air contaminants (hydrogen chloride and hydrogen sulfide).

Probability distributions
The normal distribution
This distribution, sometimes referred to as the Gaussian distribution, occurs throughout statistics. Traditionally, the mean and the standard deviation of a random variable are used to specify a normal distribution. For a continuous random variable x with mean a₁ and standard deviation b₁, the cumulative distribution function F(x) for the normal distribution is given by Equation (1).
It is straightforward to convert Equation (1) to a breakthrough curve model. Since x is a continuous variable, it can be replaced by the variable t, the operation time of a fixed bed adsorber. Since F(x) varies from 0 to 1, it can be replaced by the dimensionless effluent concentration, c/c₀, where c is the effluent concentration at any time t and c₀ is the feed concentration. The resulting expression is given by Equation (2), where a₁ and b₁ are treated as fitting parameters. An expression similar to Equation (2) was used by Neufeld and Thodos (1969) to correlate the breakthrough curves of phosphate obtained from a fixed bed column packed with activated alumina.
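As a minimal sketch, the breakthrough model described above can be evaluated with nothing more than the error function. The code assumes that Equation (2) is the normal cumulative distribution function written in its usual error-function form; the function name and the parameter values are hypothetical and serve only to illustrate the curve shape.

```python
import math

def normal_breakthrough(t, a1, b1):
    """c/c0 at time t from the normal CDF (assumed form of Equation (2)).

    a1 plays the role of the mean and b1 the standard deviation,
    both expressed in the same time units as t.
    """
    return 0.5 * (1.0 + math.erf((t - a1) / (b1 * math.sqrt(2.0))))

# Hypothetical parameters (min), not taken from the paper.
a1, b1 = 100.0, 20.0
for t in (60, 80, 100, 120, 140):
    print(t, round(normal_breakthrough(t, a1, b1), 4))
```

The printed values are mirror images about t = a₁, which is the symmetry that the rest of the section attributes to the fixed inflection point.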
The normal distribution defined by Equation (2) predicts a sigmoid curve, whose shape is determined by its inflection point. An inflection point is where a curve changes concavity; that is, it is a point at which a curve goes from concave up to concave down, or vice versa. For the normal distribution, the location of its inflection point can be determined by finding at what time the second derivative of the function is zero. The second derivative of Equation (2) is given by Equation (3).
By equating the preceding expression to zero and solving the resulting equation for t, we obtain t = a₁ at the inflection point. Substitution of this t value into Equation (2) yields Equation (4), which defines the location of the inflection point. According to Equation (4), a sigmoid curve predicted by the normal distribution will have its inflection point located at the midpoint (c/c₀ = 0.5).
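The midpoint result can be checked by writing the steps out. The block below is a sketch that assumes Equation (2) is the normal cumulative distribution function in its usual error-function form; the derivative expressions are the standard ones for that form and are offered as a plausible reading of Equations (3) and (4), not a transcription of them.

```latex
\begin{align}
\frac{c}{c_0} &= \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{t - a_1}{b_1\sqrt{2}}\right)\right] \\
\frac{d}{dt}\!\left(\frac{c}{c_0}\right) &= \frac{1}{b_1\sqrt{2\pi}}
  \exp\!\left[-\frac{(t - a_1)^2}{2 b_1^2}\right] \\
\frac{d^2}{dt^2}\!\left(\frac{c}{c_0}\right) &= -\frac{t - a_1}{b_1^3\sqrt{2\pi}}
  \exp\!\left[-\frac{(t - a_1)^2}{2 b_1^2}\right] \\
\frac{d^2}{dt^2}\!\left(\frac{c}{c_0}\right) = 0 \;&\Rightarrow\; t = a_1, \qquad
  \left.\frac{c}{c_0}\right|_{t = a_1} = \frac{1}{2}\left[1 + \operatorname{erf}(0)\right] = \frac{1}{2}
\end{align}
```

Because the exponential factor is never zero, the second derivative vanishes only at t = a₁, so neither fitting parameter can move the inflection point away from c/c₀ = 0.5.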

The log-normal distribution
This distribution is a modification of the normal distribution wherein t in Equation (2) is replaced by a log-transformed t, i.e., ln(t). Since one can take the logarithm only of dimensionless numbers, the variable t, which has units of time, must be made dimensionless. It follows that both a₁ and b₁ in Equation (2) must also be dimensionless. Note that the log-normal distribution should have its own unique constants, so a₁ and b₁ are hereafter called a₂ and b₂. Some mathematical manipulation is needed to make t, a₂, and b₂ dimensionless. First, we assume that all three quantities are expressed in units of min. Next, we multiply the three quantities by t*/t*, where t* = 1 min. We now have t*(t/t*), t*(a₂/t*), and t*(b₂/t*), with the bracketed terms being dimensionless quantities. Finally, we take the logarithm of (t/t*), obtaining t* ln(t/t*). Equation (2) can now be rewritten as Equation (5), which is then simplified to Equation (6).
The preceding expression is the log-normal distribution with dimensionless quantities within the argument of the error function. Since t* is numerically equal to unity, it may be omitted from Equation (6), which gives Equation (7). The last result is still the log-normal distribution, now written with a dimensionless t and two dimensionless fitting parameters, a₂ and b₂. Multiplying the fitted values of the dimensionless parameters a₂ and b₂ by t*, which is equal to 1 min, recovers the dimensioned parameters a₂ and b₂.
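The nondimensionalization steps can be written out explicitly. The following is a reconstruction from the verbal description, assuming Equation (2) has the usual error-function form; it is meant as a plausible reading of Equations (5) to (7), not a verbatim copy of them.

```latex
\begin{align}
\frac{c}{c_0} &= \frac{1}{2}\left[1 + \operatorname{erf}\!\left(
  \frac{t^{*}\ln(t/t^{*}) - t^{*}(a_2/t^{*})}{t^{*}(b_2/t^{*})\sqrt{2}}\right)\right]
  && \text{(cf. Equation (5))} \\
&= \frac{1}{2}\left[1 + \operatorname{erf}\!\left(
  \frac{\ln(t/t^{*}) - a_2/t^{*}}{(b_2/t^{*})\sqrt{2}}\right)\right]
  && \text{(cf. Equation (6))} \\
&= \frac{1}{2}\left[1 + \operatorname{erf}\!\left(
  \frac{\ln t - a_2}{b_2\sqrt{2}}\right)\right]
  && \text{(cf. Equation (7), with } t^{*} = 1 \text{ min dropped)}
\end{align}
```

The common factor t* cancels between numerator and denominator in the first step, which is why the final form depends only on the numerical values of t, a₂, and b₂ expressed in minutes.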
The second derivative of Equation (7) is given by Equation (8). Setting Equation (8) to zero and solving for ln(t) leads to ln(t) = a₂ − b₂². Substitution of the last result into Equation (7) yields Equation (9), which indicates that the location of the inflection point varies with b₂. The log-normal distribution therefore has a variable or floating inflection point.
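A short numerical sketch makes the floating inflection point concrete. It assumes Equation (7) is the log-normal cumulative distribution function in error-function form and uses the condition ln(t) = a₂ − b₂² stated above; the function names and the parameter values are hypothetical.

```python
import math

def lognormal_breakthrough(t, a2, b2):
    """c/c0 from the log-normal CDF (assumed form of Equation (7)); requires t > 0."""
    return 0.5 * (1.0 + math.erf((math.log(t) - a2) / (b2 * math.sqrt(2.0))))

def lognormal_inflection(a2, b2):
    """Return (t, c/c0) at the inflection point, using ln(t) = a2 - b2**2."""
    t_inf = math.exp(a2 - b2 ** 2)
    return t_inf, lognormal_breakthrough(t_inf, a2, b2)

# Hypothetical a2; only b2 changes the height of the inflection point.
for b2 in (0.1, 0.25, 0.5, 1.0):
    t_inf, y_inf = lognormal_inflection(4.0, b2)
    print(f"b2={b2:4.2f}  t_inf={t_inf:7.2f}  c/c0={y_inf:.3f}")
```

Under this form the inflection height equals 0.5[1 − erf(b₂/√2)], so it approaches the midpoint as b₂ tends to zero and moves toward the origin as b₂ grows, independently of a₂.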
Nonlinear least-squares regression
Nonlinear least-squares regression was applied to the two probability distributions to estimate their free parameters. The overall fit of a distribution was measured with the coefficient of determination R² and the residual root mean square error (RRMSE). It is not necessary to use other statistical indicators such as the Akaike Information Criterion for model comparison because the two distributions contain the same number of fitting parameters. The R² and RRMSE metrics are given by Equations (10) and (11), respectively, where w is the number of data points, Mⱼ are model values of c/c₀, Eⱼ are experimental values of c/c₀, Eₘ is the mean of the observed data, and p is the number of adjustable parameters. Note that the R² metric is traditionally defined only for linear relationships. Nonetheless, it is commonly used as an informal measure of the goodness of a nonlinear fit.
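The two fit statistics are easy to compute once model and experimental values are paired. The sketch below assumes the conventional definitions, R² = 1 − Σ(Eⱼ − Mⱼ)²/Σ(Eⱼ − Eₘ)² and RRMSE = [Σ(Eⱼ − Mⱼ)²/(w − p)]^½, which are consistent with the symbols defined above but may differ in detail from Equations (10) and (11). The data values are invented for illustration.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed form of Equation (10))."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((e - m) ** 2 for e, m in zip(observed, predicted))
    ss_tot = sum((e - mean_obs) ** 2 for e in observed)
    return 1.0 - ss_res / ss_tot

def rrmse(observed, predicted, n_params):
    """Residual root mean square error with w - p degrees of freedom
    (assumed form of Equation (11))."""
    ss_res = sum((e - m) ** 2 for e, m in zip(observed, predicted))
    return math.sqrt(ss_res / (len(observed) - n_params))

# Invented c/c0 values, for illustration only.
observed = [0.0, 0.1, 0.5, 0.9, 1.0]
predicted = [0.02, 0.12, 0.48, 0.88, 0.99]

r2 = r_squared(observed, predicted)
err = rrmse(observed, predicted, n_params=2)
print(f"R2 = {r2:.4f}, RRMSE = {err:.4f}")
```

Both distributions have two adjustable parameters, so p = 2 in either case and the two statistics can be compared directly between the fits.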

Results and discussion
Four sets of previously published breakthrough data for water and air contaminant adsorption in fixed bed columns are used to evaluate and compare the data fitting abilities of the normal and log-normal distributions. To provide rigorous comparisons, the data sets were carefully selected to reflect different curve characteristics. The first data set, reported by Sausen et al. (2018), exhibits a slightly asymmetric curve shape. The second one, taken from the work of Nguyen et al. (2019), is characterized by a leakage of the contaminant at the column exit. The third and fourth data sets, reported by Papurello et al. (2019), display different degrees of breakthrough curve asymmetry.
Case 1: ciprofloxacin breakthrough curve
In this case study, a series of column experiments were performed to measure the breakthrough curves of ciprofloxacin as functions of bed height and flow rate (Sausen et al. 2018). Ciprofloxacin, a widely used antibiotic, has been detected in wastewater treatment plant effluents and surface waters. The fixed bed experiments were conducted in a glass column with an internal diameter of 1 cm and a length of 30 cm. The column was packed with a commercial cation exchanger with an average size of 0.653 mm. Figure 1 shows the selected data set, obtained with the following experimental conditions: feed concentration = 100 mg L⁻¹, flow rate = 5 cm³ min⁻¹, and packed bed length = 7.6 cm. Figure 1 plots the fitted curves of the normal and log-normal distributions, calculated using the best-fit parameters listed in Table 1, case 1. Visual inspection of Figure 1 indicates that the measured data are well represented by the normal and log-normal distributions, with the former fitting the initial stages marginally better than the latter. The values of R² and RRMSE for the two fits, presented in Table 1, case 1, support this observation. There are trivial differences in the fit statistics, indicating that the two distributions are similar in performance.
Because of the highly symmetric nature of the measured breakthrough curve, a probability distribution with an inflection point close to the midpoint is expected to perform well. The inflection point of the normal distribution is always located at the midpoint, while that of the log-normal distribution, calculated from Equation (9), is located at c/c₀ = 0.41, which is fairly close to the midpoint. It is therefore not surprising that the two distributions can handle the slightly asymmetric ciprofloxacin breakthrough curve rather well.

Case 2: ammonium breakthrough curve
This case reports the removal of ammonium by corncob biochar packed in a glass column with an internal diameter of 1.5 cm and a total column length of 50 cm (Nguyen et al. 2019). The effects of initial concentration, flow rate, and packed bed length on the breakthrough characteristics of ammonium were investigated. The selected data set was obtained with a feed concentration of 10 mg L⁻¹, a flow rate of 1 cm³ min⁻¹, and a packed length of 8 cm. Figure 2 presents the measured data and plots the fits of the two distributions. The fit statistics and parameter estimates are presented in Table 1, case 2. The two fitted curves in Figure 2 highlight the fact that there is little difference in the performances of the two distributions. The initial stages of the breakthrough curve are poorly represented by the two distributions, as can be seen in Figure 2. However, the R² scores for the two fits are very impressive (R² > 0.99) because most of the data points above the initial stages of breakthrough are well tracked by the two distributions. Overall fit indicators such as the R² metric do not always tell the whole story. When assessing goodness-of-fit, it is inappropriate to trumpet high R² scores without first examining a graph of the data together with the model fit. Figure 2 reveals the presence of a fixed bed phenomenon known as "fronting/leakage," which refers to the appearance of a noticeable effluent concentration level soon after the initiation of column operation. This nonzero effluent concentration level remains relatively stable for a period of time before the breakthrough curve rises rapidly. If the desired breakthrough concentration is comparable to the level of nonzero effluent concentration, the leakage phenomenon will lead to a premature breakthrough time and consequently result in severe underutilization of the column capacity. There are several reasons for the occurrence of leakage, with non-uniform flow being one.
The problem of non-uniform flow is most likely due to poor column packing. The use of adsorbents with irregular particle shapes or compressible adsorbent materials is rather common in this field of research. It is not easy to pack a column well with adsorbents of this type. Since fronting breakthrough curves are undesirable, there is little point in developing models that can track such curve shapes. Efforts should instead focus on eliminating the fronting phenomenon by optimizing the experimental conditions and equipment.

Case 3: hydrogen chloride breakthrough curve
This example, taken from the work of Papurello et al. (2019), reports the breakthrough characteristics of a simulated biogas stream containing hydrogen chloride and hydrogen sulfide in glass filters packed with two types of commercial activated carbon (Norit RST3 and Airdep Carbox) and a biochar material produced by wood gasification. The adsorption experiments were conducted using adsorbent particles in the size range of 0.1-0.18 mm and a gas hourly space velocity of 1636 h⁻¹. Feed concentrations of HCl in the range of 10-377 ppm(v) and of H₂S in the range of 10-200 ppm(v) were used. Figure 3 presents a set of HCl breakthrough data obtained with a biochar filter subjected to a feed concentration of 377 ppm(v). Also shown in Figure 3 are the two probability distribution fits, calculated using the parameter estimates presented in Table 1, case 3. The observed breakthrough curve depicted in Figure 3 is different from those shown in Figures 1 and 2. It is asymmetric in shape and characterized by immediate HCl breakthrough, followed by a sharp rise in the exit concentration, and finally by a long period of filter saturation. This curve shape is known as "tailing." Note that the initial section of the experimental profile exhibits only a very slight curvature, which makes the breakthrough curve look like a hyperbolic curve instead of a sigmoid curve. As Figure 3B shows, the asymmetric shape of the observed breakthrough curve is well represented by the log-normal distribution (R² = 0.999). By contrast, the agreement between the observed data and the normal distribution fit is unsatisfactory (R² = 0.968), as can be seen in Figure 3A.
The difference in performance between the log-normal and normal distributions can be explained by the fact that the former has a floating inflection point while the latter has an invariant one. The inflection point of the log-normal distribution, calculated from Equation (9) using the value of b₂ given in Table 1, is located at c/c₀ = 0.17. To fit the tailing breakthrough curve depicted in Figure 3, it is necessary to shift the inflection point away from the midpoint and toward the (0,0) origin. The log-normal fit with a near-perfect R² score of 0.999 is remarkably accurate in tracking the tailing HCl data. This result underscores the fact that a floating inflection point plays a critical role in determining the performance of a probability distribution.
As already indicated, the normal distribution has an invariant inflection point located at the midpoint. As a consequence, it is confined to predicting perfectly symmetric curves and therefore cannot describe the Figure 3 data to a significant degree of precision. In addition to giving a poor fit to the tailing HCl data, the normal distribution predicts a significant nonzero exit concentration at t = 0, as can be seen in Figure 3A. At time zero, the dimensionless exit concentration predicted by the normal distribution is 10.8%, which contradicts the expected value of zero. The log-normal function is undefined at t = 0, but it predicts very small exit concentrations near time zero.
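The nonzero intercept of the normal distribution follows directly from its form: at t = 0 the predicted c/c₀ is the standard normal cumulative probability evaluated at −a₁/b₁. The sketch below assumes Equation (2) is the normal cumulative distribution function; the function names and the values a₁ = 50 and b₁ = 40 are hypothetical and are not the Table 1 estimates.

```python
from statistics import NormalDist

def normal_exit_at_zero(a1, b1):
    """c/c0 predicted at t = 0 by the normal model (assumed form of Equation (2))."""
    return NormalDist(mu=a1, sigma=b1).cdf(0.0)

def ratio_for_exit_level(level):
    """a1/b1 ratio at which the normal model gives c/c0 = level at t = 0."""
    return -NormalDist().inv_cdf(level)

# Hypothetical parameters (min), chosen only to show a broad curve starting near t = 0.
example = normal_exit_at_zero(a1=50.0, b1=40.0)
ratio = ratio_for_exit_level(0.108)
print(f"c/c0 at t = 0: {example:.3f}")
print(f"a1/b1 giving 10.8% at t = 0: {ratio:.2f}")
```

Under this assumed form, an intercept of 10.8% corresponds to a fitted mean only about 1.24 standard deviations away from time zero, which is what happens when a symmetric curve is forced through data that rise sharply at the start and saturate slowly.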

Case 4: hydrogen sulfide breakthrough curve
Another set of gas contaminant breakthrough data reported in the work of Papurello et al. (2019) is used to test the two probability distributions. The selected data set was obtained with a carbon filter (Norit RST3) subjected to an H₂S feed stream (c₀ = 76 ppm(v)). The H₂S data set, presented in Figure 4, differs from the HCl data set shown in Figure 3 in two major aspects. First, there is no immediate H₂S breakthrough. Second, the H₂S curve exhibits a significant degree of tailing. Figure 4 plots the two probability distribution fits, calculated using the parameter estimates listed in Table 1, case 4. In comparing the two fits, Figure 4 shows that the log-normal fit is considerably more accurate than the normal fit. However, returning an R² score of 0.977, the log-normal distribution seems to have some difficulty in representing this particular data set with strong tailing. The normal fit (R² = 0.933) is clearly very poor, bypassing most of the data points and yielding an obvious nonzero exit concentration at time zero. Papurello et al. (2019) used a mechanistic model to describe several sets of H₂S breakthrough data. It seems that the mechanistic model was used to fit only the initial stages of the observed breakthrough curves. The ability of the mechanistic model to track the entire breakthrough profiles was not investigated.

Linear forms
The normal and log-normal probability distributions can be transformed into linear forms, and linear regression can then be used to estimate their parameters. Examples of such linearized normal and log-normal probability distributions are given by Equations (12) and (13), respectively.
s = ln(1 − y²)/2    (16)
Figure 5A shows that the fit of the linearized normal distribution to the H₂S breakthrough data is very poor (R² = 0.854). The resulting parameter estimates are quite different from those generated by nonlinear regression (Table 2). Figure 5B shows that the fit of the linearized log-normal distribution is more accurate than that of the linearized normal distribution, returning an R² score of 0.934. As can be seen in Table 2, the resulting parameter estimates in this case are comparable to those obtained by nonlinear regression. However, the linear regression results are not statistically optimal. It should be mentioned that the R² scores for the linear fits indicate how well Equations (12) and (13) fitted transformed data. The parameter estimates obtained by the linear fits can be used to calculate c/c₀ versus t data. Comparison of the calculated data and observed c/c₀ versus t data will produce different R² values.
In nonlinear regression, initial values for each unknown parameter in the model must be specified, usually based on intelligent guessing or previous experience. If the initial values are far from the final values, the nonlinear regression procedure can go in the wrong direction and either never converge on a solution or converge on the wrong solution. To guard against this problem, one can use parameter estimates based on linear transformations as initial values for nonlinear regression. In a practical sense, the linearized log-normal distribution given by Equation (13) is quite useful for preliminary data analysis.
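The idea of seeding nonlinear regression with a linearized fit can be sketched as follows. The linearization used here is the probit form, probit(c/c₀) = ln(t)/b₂ − a₂/b₂, which is one natural way to linearize the log-normal distribution and may differ in form from Equation (13). The refinement step is a crude compass search standing in for a proper nonlinear least-squares routine. All function names and data are hypothetical; the data are synthetic and noise-free.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for the CDF and the probit transform

def lognormal_breakthrough(t, a2, b2):
    """c/c0 from the log-normal CDF; requires t > 0 and b2 > 0."""
    return STD.cdf((math.log(t) - a2) / b2)

def linear_initial_guess(times, ratios):
    """Least-squares line through probit(c/c0) versus ln(t).

    slope = 1/b2 and intercept = -a2/b2. Points with c/c0 outside (0, 1)
    are dropped because the probit transform is undefined there.
    """
    pts = [(math.log(t), STD.inv_cdf(y)) for t, y in zip(times, ratios) if 0.0 < y < 1.0]
    mx = sum(x for x, _ in pts) / len(pts)
    mz = sum(z for _, z in pts) / len(pts)
    slope = sum((x - mx) * (z - mz) for x, z in pts) / sum((x - mx) ** 2 for x, _ in pts)
    intercept = mz - slope * mx
    return -intercept / slope, 1.0 / slope  # (a2, b2)

def sse(params, times, ratios):
    """Sum of squared errors in the untransformed c/c0 values."""
    a2, b2 = params
    if b2 <= 0.0:
        return float("inf")
    return sum((y - lognormal_breakthrough(t, a2, b2)) ** 2 for t, y in zip(times, ratios))

def compass_search(params, times, ratios, step=0.1, tol=1e-8):
    """Try +/- step on each parameter; halve the step when nothing improves."""
    params = list(params)
    best = sse(params, times, ratios)
    while step > tol:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = params[:]
                trial[i] += delta
                value = sse(trial, times, ratios)
                if value < best:
                    params, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return params, best

# Synthetic data generated from hypothetical parameters (t in min).
true_a2, true_b2 = 4.0, 0.8
times = [10, 20, 40, 60, 90, 130, 200, 300]
ratios = [lognormal_breakthrough(t, true_a2, true_b2) for t in times]

start = linear_initial_guess(times, ratios)
fitted, residual = compass_search(start, times, ratios)
print("initial guess from linear fit:", start)
print("refined estimates:", fitted)
```

With real, noisy data the two sets of estimates will generally differ, because the linear fit minimizes errors in the transformed variable whereas the refinement minimizes errors in c/c₀ itself. Data points at exactly 0 or 1 cannot be transformed, which is one reason the linear estimates are best treated as starting values.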

Probability distributions and phenomenological models
Interestingly, a handful of phenomenological models based on the assumption of linear adsorption bear a resemblance to the normal distribution. We present three such models here: (a) Equation (17) is a simplified fixed bed model proposed by Lapidus and Amundson (1952), (b) Equation (20) is an asymptotic form of a fixed bed model developed by Rosen (1952), and (c) Equation (23) is an approximate form of a fixed bed model put forward by Anzelius (1926).
The last result is mathematically analogous to the log-normal distribution. Given that m₀ = 1 (Xiu et al. 1997), if we let μ = a₂, σ = b₂, and τ = t, Equation (27) reduces to Equation (7), the log-normal distribution. The two equations differ in the definitions of their parameters: μ, σ, and τ comprise physically significant parameters, while a₂ and b₂ are empirical parameters. It seems that, under the assumption of linear adsorption, a fixed bed breakthrough curve can be considered as a cumulative probability distribution curve, and it is possible to link the fixed bed's parameters to the probability distribution's parameters.
We have shown that the log-normal distribution with a floating inflection point can fit both symmetric and asymmetric breakthrough curves, while the normal distribution with an invariant inflection point is confined to fitting symmetric ones. Consequently, one can conclude that the model of Xiu et al. (1997) is more versatile and will outperform the models of Lapidus and Amundson (1952), Rosen (1952), and Anzelius (1926) in tracking asymmetric breakthrough curves.

Conclusions
Probability distribution functions can be used to correlate contaminant breakthrough data. In this work, the normal and log-normal distributions have been tested against published breakthrough data for the adsorption of ciprofloxacin, ammonium, HCl, and H₂S in fixed bed columns. The two distributions were found to provide excellent representations of the slightly asymmetric ciprofloxacin breakthrough curve. For the fronting ammonium breakthrough curve, both distributions provided satisfactory overall fits but failed to track the leakage of ammonium during the initial period of column operation. When challenged with the tailing breakthrough curves of HCl and H₂S, the normal distribution was found to provide unsatisfactory fits. In contrast, the log-normal distribution was very effective in tracking the two asymmetric breakthrough curves. The superior performance of the log-normal distribution is due to the fact that it has a floating inflection point. Further studies extending to other probability distributions with a variable inflection point are needed to test the validity of the present results.