Performance evaluation of a production line operated under an echelon buffer policy

We consider a production line consisting of several machines in series separated by intermediate finite-capacity buffers. The line operates under an "echelon buffer" (EB) policy, according to which each machine can store the parts that it produces in any of its downstream buffers if the next machine is occupied. If the capacities of all but the last buffer are zero, the EB policy is equivalent to CONWIP. To evaluate the performance of the line under the EB policy, we model it as a queueing network, and we develop a method that decomposes this network into as many nested segments as there are buffers and approximates each segment with a two-machine subsystem that can be analyzed in isolation. For the case where the machines have geometrically distributed processing times, we model each subsystem as a two-dimensional Markov chain that can be solved numerically. The parameters of the subsystems are determined by relationships among the flows of parts through the echelon buffers in the original system. An iterative algorithm is developed to solve these relationships. We use this method to evaluate the performance of several instances of 5- and 10-machine lines, including cases where the EB policy is equivalent to CONWIP. Our numerical results show that this method is highly accurate and computationally efficient. We also compare the performance of the EB policy against the performance of the traditional "installation buffer" policy, according to which each machine can store the parts that it produces only in its immediate downstream buffer if the next machine is occupied.


Introduction
Production lines are the prevailing layout in high-volume discrete-part manufacturing. A production line consists of several machines that are visited by all parts once and in a fixed sequence. The time that a part spends on a machine is often random, due to unpredictable disruptions in the production process and/or variability in the processing requirements of the part. This randomness causes congestion in the line and adversely affects system performance, most notably throughput. One way to increase throughput is to raise the processing rates of the machines starting with the slowest one. Another way is to reduce the variance of processing times causing the congestion. Both approaches require the adoption of good engineering and operating practices at the machine level and possibly investing in new equipment. A third alternative is to reduce the effect of randomness by inserting finite buffers between the machines so that parts flow from machine to buffer to machine and so on until they exit the line. Inserting a buffer between two machines speeds up the line by decoupling the operation of the machines, as long as this buffer is neither full nor empty, hence limiting the propagation of processing time delays.
In the traditional way of operating a line with intermediate buffers, each machine is allowed to store the parts that it produces only in its immediate downstream buffer if the next machine is occupied. We refer to the ensemble of that buffer and the next machine as an installation buffer and to the resulting policy as the Installation Buffer (IB) policy. Under the IB policy, a machine is blocked from processing a part if the installation Work-In-Process (WIP) following this machine, which is defined as the number of parts that have been produced by it but have not yet exited the next machine, is equal to the capacity of the installation buffer downstream of the machine.

CONTACT: George Liberopoulos, glib@mie.uth.gr. Supplemental data for this article can be accessed on the publisher's website.
Inserting buffers between machines comes at a cost of additional WIP inventory, capital investment, and floor space. Depending on the industry, this cost can be quite high. For instance, in a car manufacturing body shop, the production and material cost of a part may go up to $10 000, based on discussions with experts. With a 5% interest rate, the annual unit cost of WIP can be as high as $500. The capital investment for a single buffer space in a body shop can cost thousands of dollars (see Askin and Fowler (2013) summarizing Lagershausen et al. (2013)); an investment of $5000-10 000 per space, resulting in an annual capital cost of $250-500 at a 5% interest rate, is not unusual. On top of this cost, one may have to account for depreciation over the lifetime of the buffer space, which could be up to 10 years. With linear depreciation, this could add another $500-1000 to the annual cost of the investment.
The optimal allocation of storage capacity among the intermediate buffers is one of the most widely studied problems in manufacturing systems research. Even if the total capacity has been optimized, storing parts locally in the buffers immediately following the machines does not take full advantage of this capacity. When the cost of buffer space is significant, it may be worthwhile to consider increasing the utilization of the existing buffers before setting out to increase total buffer capacity. One way to increase buffer utilization is to allow the machines to store the parts that they produce in buffers other than their immediate downstream buffers. Such a mode of operation is not unusual in practice, especially in systems where material handling is performed by humans. We have witnessed this in a producer of large industrial conveyor belts where operators transfer WIP material with forklift trucks, in a manufacturer of medium-sized metallic parts where workers transfer parts with trolleys, and in other production environments. The question is, can this be done systematically and, if so, what are the gains and losses in performance?
In this article, we consider a policy aimed at increasing the utilization of buffers in a production line by allowing each machine to store the parts that it produces in any of its downstream buffers if the next machine is occupied, rather than only in its immediate downstream buffer, as is the case under the classic IB policy. The ensemble of all of the downstream buffers and the next machine is referred to as the echelon buffer, and the resulting policy is referred to as the Echelon Buffer (EB) policy. Under the EB policy, a machine is blocked from processing a part if the echelon WIP following this machine, which is defined as the number of parts that have been produced by it but have not yet exited the line, is equal to the capacity of the echelon buffer downstream of the machine.
The terms "installation" and "echelon" originate from inventory control theory, with the term echelon dating back to Clark (1958). In a multi-stage inventory system, under an "installation stock" policy, the ordering decision at each stage (installation) is based on the inventory position at this stage, whereas under an "echelon stock" policy, it is based on the echelon stock position, which is defined as the sum of the installation stock positions at this stage and all of its downstream stages. Axsäter and Rosling (1993) show that for reorder point-reorder quantity rules, echelon stock policies are in general superior to installation stock policies. Given that results for inventory systems do not generally carry over to production systems, due to the limited capacity in the latter systems causing congestion and affecting production time in a non-trivial way, the ultimate goal of this research is to compare the performance of the EB policy against that of the IB policy in the case of production lines.
To evaluate the performance of a production line under the EB policy, we model it as a queueing network, and we develop a method that is based on decomposing this network into as many nested segments as there are buffers and approximating each segment with a two-machine subsystem that can be analyzed in isolation. For the case where the machines have geometrically distributed processing times, we model each subsystem as a two-dimensional Markov chain that can be solved numerically. The parameters of the subsystems are determined by relationships among the flows of parts through the echelon buffers in the original system. An iterative algorithm is developed to solve these relationships. We evaluate the accuracy of this method by comparing it against simulation for several instances of five- and 10-machine lines, including cases where the EB policy is equivalent to constant work in process (CONWIP). Our numerical results show that the method is very accurate and computationally efficient. We also compare the performance of the EB policy against the performance of the IB policy, which we evaluate by simulation. The use of this method to optimally design the echelon buffer capacities is deferred for future consideration.
The remainder of this article is organized as follows. In Section 2, we describe the operation of a production line under the EB policy and discuss some of its advantages and disadvantages. In Section 3, we review the related literature. In Section 4, we describe the discrete-time queueing network model of the EB policy-controlled line. In Section 5, we present the decomposition-based approximation method for analyzing this network. In Section 6, we present the analysis of each subsystem of the decomposition, and in Section 7, we present the analysis of the entire system. In Section 8, we present numerical results on the performance of the decomposition method and on the effect of system parameters on performance. We also compare the performance of the EB policy against the performance of the IB policy. Finally, we draw conclusions in Section 9. Certain derivations and numerical results are included in an online supplement to this article.

Description of a production line operated under an EB policy
We consider a production line consisting of N machines in series, denoted by M_n, n = 1, ..., N, and N − 1 intermediate finite-capacity buffers, denoted by B_n, n = 1, ..., N − 1. Each machine has unit capacity, and the capacity of buffer B_n is denoted by C_n, where C_n ≥ 0, n = 1, ..., N − 1. Under the EB policy introduced in the previous section, any part produced by machine M_n is stored in the echelon buffer following M_n. That buffer is denoted by B^E_n and is defined as B^E_n = B_n ∪ ... ∪ B_{N−1} ∪ M_{n+1}, n = 1, ..., N − 1. In addition, M_n is blocked from processing a part if the echelon WIP following it (i.e., the number of parts that have been produced by it but have not yet exited the line) is equal to the capacity of B^E_n, which is given by 1 + \sum_{m=n}^{N-1} C_m. This implies that the cap on the total line WIP following M_1 is 1 + \sum_{n=1}^{N-1} C_n. In contrast, under the traditional IB policy, any part produced by machine M_n can be stored only in the installation buffer following M_n. That buffer is denoted by B^I_n and is defined as B^I_n = B_n ∪ M_{n+1}, n = 1, ..., N − 1. Moreover, M_n is blocked from processing a part if the installation WIP following it (i.e., the number of parts that have been produced by it but have not yet departed from the next machine, M_{n+1}) is equal to the capacity of B^I_n, which is given by 1 + C_n. In this case, the maximum total line WIP, which is defined as the WIP following M_1, is N − 1 + \sum_{n=1}^{N-1} C_n. The type of blocking that we have considered here is known as Blocking Before Service with Position Occupied (BBS-PO) and is presumed in the seminal two-machine model of Gershwin and Berman (1981) and in several other works. Another blocking mechanism that is often encountered in manufacturing is Blocking After Service. The analysis under both mechanisms is similar, and the difference in performance between them becomes negligible for large buffer sizes (Dallery and Gershwin, 1992).
Throughout this article, we adopt the BBS-PO convention because it leads to a simpler description. Figure 1 shows a production line operated under an EB policy for N = 4, where the circles represent machines and the rectangles represent buffers. To further clarify how the EB policy works, we note that machine M_1 stores the parts that it produces in buffer B_1, B_2, or B_3 if M_2 is occupied, with B_1 having the highest priority and B_3 the lowest. Similarly, machine M_2 stores the parts that it produces in buffer B_2 or B_3 if M_3 is occupied. Finally, machine M_3 stores the parts that it produces in buffer B_3 if M_4 is occupied. Furthermore, M_1 is blocked from processing a part if the number of parts that have been produced by it but have not yet exited the line is equal to the capacity of B^E_1, which is 1 + C_1 + C_2 + C_3. Similarly, M_2 is blocked if the number of parts that have been produced by it but have not yet exited the line is equal to 1 + C_2 + C_3. Finally, M_3 is blocked if the number of parts that have been produced by it but have not yet exited the line is equal to 1 + C_3.
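The echelon capacities and blocking conditions above can be made concrete with a small sketch. The helper names below are ours, not the paper's; the functions simply encode the definition K_n = 1 + C_n + ... + C_{N−1} and the blocking test of the EB policy.

```python
# Hypothetical helpers (our own names) illustrating the echelon-buffer
# capacities and the EB blocking condition described in the text.

def echelon_capacities(C):
    """Echelon capacity of machine n (0-indexed): 1 + sum of C_n..C_{N-1}."""
    return [1 + sum(C[n:]) for n in range(len(C))]

def is_blocked(n, echelon_wip, K):
    """Machine M_{n+1} is blocked when its echelon WIP reaches its capacity."""
    return echelon_wip[n] == K[n]

# Four-machine example of Figure 1 with C_1 = C_2 = C_3 = 1:
C = [1, 1, 1]
K = echelon_capacities(C)
print(K)  # [4, 3, 2]: M1 blocks at 4 parts in the line, M2 at 3, M3 at 2
```

With C = [0, 0, 2] the same helper returns [3, 3, 3], matching the CONWIP special case discussed next, where all echelon capacities coincide.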
If the physical layout of the production line is one where there is a common storage area on the side of the line rather than separate buffers between the machines, then under the EB policy, this area is divided into compartments that play the same role as the intermediate buffers in the traditional serial layout. In this case, the flow of parts is identical to that in the serial layout shown in Figure 1, except that the buffers are clustered compartments, as shown in Figure 2(a).
If the capacities of all intermediate buffers, except possibly the last one, are zero (i.e., C_1 = C_2 = 0 and C_3 = C ≥ 0), then every machine stores the parts that it produces in the last and only buffer if the next machine is occupied. This is shown in Figure 2(b), where the last buffer is denoted by B and is drawn on the side of the line as a common buffer. In this case, the first machine is blocked from processing a part if the number of parts that have been produced by it but have not yet exited the line is equal to 1 + C. No other machine is ever blocked. This way of operation is identical to the operation of a CONWIP system, where parts are not allowed to be released into the system if the total WIP is at the WIP-cap. Note that here we have adopted the original definition of CONWIP, according to which the WIP in the system is capped rather than constant (Spearman et al., 1990), even though CONWIP stands for constant work in process. For the purposes of this article, we will henceforth refer to an EB policy where all buffers except the last one have zero capacities and the last buffer has capacity C ≥ 0 as CONWIP with WIP-cap 1 + C.
Note that a line of three or more machines operated under the EB policy has a lower total WIP-cap than the same line operated under the IB policy. More specifically, for an N-machine line (N ≥ 3), the total WIP-cap under the EB policy is 1 + \sum_{n=1}^{N-1} C_n, whereas under the IB policy it is N − 1 + \sum_{n=1}^{N-1} C_n, as mentioned earlier. The difference in WIP-cap between the two policies is significant only for short lines with very low buffer capacities. As a result of this difference, in such lines, it may happen that the EB policy yields lower average throughput and/or WIP than does the IB policy. In general, however, it is expected that a production line operated under an EB policy should have higher average throughput, at the cost of higher average WIP, than the same line operated under an IB policy, due to the higher utilization of buffer space under the EB policy.
An important advantage of the EB policy, in addition to increasing buffer space utilization, is that it uses global information, as it enables each machine to process parts based on the entire echelon WIP level downstream of this machine. This can be beneficial, especially if the WIP holding cost increases significantly downstream in the line, as is the case with products that have high added value. In contrast, the IB policy uses only local information, as it enables each machine to process parts based on the WIP level of the local installation buffer immediately following it.
A shortcoming of the EB policy is that it has increased material handling requirements compared with the IB policy. Modern technology, however, can handle such increased requirements at affordable costs (Matta et al., 2005). Today, several affordable modular and reconfigurable material handling solutions exist that are less automated than traditional systems and can be assembled in a flexible way to transport parts on the manufacturing floor (Furmans et al., 2010). Many of the material handling ideas and equipment that are used today in production lines with reentrant flows can also be used to implement the EB policy. The interested reader is referred to two surveys on automated material handling systems in semiconductor manufacturing by Agrawal and Heragu (2006) and Montoya-Torres (2006). The material handling technology for implementing the EB policy can also be found in classic flexible manufacturing systems and their successors, reconfigurable manufacturing systems, where typically pallets are sent back and forth to the work centers. Such movements of material are sometimes referred to as backtracking and bypassing and have been extensively studied in the context of the more general facility layout problem (e.g., see the reviews by Hassan (1994) and Drira et al. (2007)). Finally, the related problem of controlling the flow of automated guided vehicles in manufacturing environments with complex flows has also been extensively studied. Two surveys on this issue are Le-Anh and de Koster (2006) and Vis (2006).
Another issue is distinguishing parts that are stored in the same buffer but are in different stages of their processing. Here, optical or electronic means can be used. Schuler and Darabi (2016) describe a case of a manufacturing facility producing mobile devices where parts in neighboring stages of their production are manually stored in shared buffer clusters. When a part is picked up for processing by a machine, its bar code or Radio-Frequency ID (RFID) is scanned to ensure that the preceding operations have been completed. When the part is processed by the machine, the operator scans the part to inform the information system that it has been processed by this machine, before putting it back to storage. While the machine is processing the part, the system identifies the next part to be picked up for processing via RFID and notifies the operator via a light or some other indicator.

Literature review
The role of intermediate storage buffers in mitigating the adverse effect of process time variability on the efficiency of manufacturing flow lines has been researched for over 50 years. Buzacott's works (1967, 1971) are among the earliest references in English on this topic. In the years that followed, the analysis of flow lines rapidly evolved into a thriving research field with significant practical implications. Many of the core ideas and methods were developed by the 1990s and appeared in numerous papers, surveys, and books (e.g., Dallery and Gershwin (1992), Askin and Standridge (1993), Buzacott and Shanthikumar (1993), Tempelmeier and Kuhn (1993), Papadopoulos et al. (1993), Gershwin (1994), Papadopoulos and Heavey (1996), and Altiok (1997)). Since then, improvements, extensions, and generalizations of previously defined problems have been established, and new problems and solution methodologies have been developed. More recent overviews and textbooks on the subject are also available. Most of the issues that have been studied throughout these years fall into one of three categories: (i) modeling aspects; (ii) performance evaluation; and (iii) optimization. A recent literature review concerning these three dimensions can be found in Shi (2012). In the next two paragraphs, we briefly review categories (i) and (ii). We defer the review of category (iii) to future work, where we plan to use the method developed in this article to optimally design the buffer capacities under the EB policy and compare the optimized EB policy against other optimized policies.
Most of the modeling aspects of production lines have been covered in Dallery and Gershwin (1992). These aspects concern the stochastic nature of machine processing times, blocking mechanisms, the nature of material flow and time (continuous/discrete), etc. The simplest way of capturing the stochastic nature of machine processing times is to model them as geometrically distributed random variables. Such machines are often referred to as following the Bernoulli reliability model or simply as Bernoulli machines. The Bernoulli machine model has been extensively used to study various aspects of production lines (e.g., Li and Meerkov (2000), Meerkov and Zhang (2008), and Billier et al. (2009)). Its continuous-time equivalent, the exponential processing time model, has also been extensively used in the literature (e.g., Altiok (1997) and Li and Meerkov (2009)). In this article, we adopt the Bernoulli machine model.
As far as the performance evaluation of production lines is concerned, many different techniques have been invoked, including simulation, Markov chain analysis, approximate analytical methods, and decomposition methods, among others. Decomposition methods in particular are two-step approaches that are based on decomposing long lines of many machines and intermediate buffers into several smaller tractable building blocks-usually two-machine, one-buffer pseudo-lines. The buffer in each pseudo-line represents one of the intermediate buffers in the original line. Typically, in the first step of such a method, the performance of each two-machine pseudo-line is evaluated given the parameters of the two machines. Many different models of two-machine systems have been analyzed in the literature, starting from the earlier simpler models (e.g., Buzacott (1967) and Gershwin and Berman (1981)) and advancing to more complex and general models in recent years (e.g., Tan and Gershwin (2009)). In the second step, the parameters of the two-machine pseudo-line are determined by relationships among the flows of parts through the intermediate buffers of the original system. The literature on decomposition methods is extensive, spanning several decades (e.g., Gershwin (1987), Dallery et al. (1988), Levantesi et al. (2003), and Colledani and Gershwin (2013)). Most of the decomposition methods that have been developed concern production lines operated under the traditional IB policy. Under that policy, parts move unidirectionally from upstream to downstream buffers; hence, the decoupling effect of each buffer is clear. Under the EB policy, however, the decoupling effect is more complex, as parts may also move in the opposite direction, from downstream to upstream buffers. To address this complexity, special attention is required.
Finally, we note that the concept of temporarily storing parts in shared buffers when the dedicated intermediate buffers following the machines are full, though used in practice, has not been thoroughly investigated in the literature. Tempelmeier et al. (1989) and later Tempelmeier and Kuhn (1993) are among the first attempts to model a Flexible Manufacturing System (FMS) with some sharing of buffer space. The FMS consists of several workstations. Each workstation has one or more machines and a local finite buffer. A central buffer is also available for storing parts if there is no space in the local buffers. Parts are mounted onto pallets that come in a fixed number. To evaluate the performance of the system, they model the FMS as a Closed Queueing Network (CQN) with blocking and solve it using numerical approximation techniques (Bruell and Balbo, 1980).
In a related study, Matta et al. (2005) considered a closed flow line with finite dedicated intermediate buffers and a finite shared common buffer that can be used by any machine whose dedicated buffer is full. It takes a certain travel time to move parts from the dedicated buffers to the common buffer and vice versa. This time, if long, may cause the machines to starve. For five-machine lines, they evaluated the throughput rate under different dedicated and shared buffer allocation configurations using simulation. They also discussed useful practical technological and economic considerations concerning the implementation of the shared buffers in real flow lines. In this article, we assume that the time to transfer parts from the machines to remote buffers and back is negligible compared to the processing times on the machines. Even if this time is not negligible, however, it is still possible to neutralize its effect on system performance by carefully planning the transfer of parts from remote buffers to the machines before these machines run out of parts from their local buffers.
There have been several other applications of CQN modeling and analysis to manufacturing and, in particular, kanban and other pull control mechanisms (e.g., Di Mascolo et al. (1996), Baynat et al. (2001), and Satyam and Krishnamurthy (2008)), as well as production lines with finite buffers (e.g., Lagershausen et al. (2013)). In one of these applications, Koukoumialos and Liberopoulos (2005) developed an analytical approximation method for the performance evaluation of a multi-stage production inventory system operated under an Echelon Kanban (EK) policy. The connection between the EK policy and the EB policy becomes evident once the association between the number of available echelon kanbans in EK and the number of available buffer spaces in EB is made. In the EK system, each stage has an input buffer and is an open queueing network of machines with load-dependent continuous-time processing rates. The main production unit in this article, on the other hand, is a Bernoulli machine with no input buffer. As a result of this difference, in the EK system, blockages of parts happen at output buffers rather than on machines, and the resulting blocking mechanism is similar to minimal blocking (Mitra and Mitrani (1989)) rather than BBS-PO. Moreover, in the EK system, the analysis of each subsystem in isolation involves a product-form approximation technique for solving a CQN problem, whereas in the EB system, each subsystem is evaluated using exact discrete-time Markov chain analysis. As a result, the accuracy of the decomposition method is higher in the EB system than it is in the EK system.
Finally, Zhou and Lian (2011) considered a two-stage tandem network where each stage has a single exponential server. Customers arrive to the first stage following a Poisson process. The waiting customers in the two stages share all or part of a common finite buffer. They modeled the system as a two-dimensional Markov chain and computed the stationary probability distribution and the sojourn time distribution. They also presented limited results on the shared buffer size that minimizes total buffer costs subject to minimum customer loss probability and maximum waiting time constraints. Their model, although limited to two servers, is somewhat similar to ours if one considers the external arrivals as being generated by a machine.

Model of a production line operated under an EB policy
In this section, we develop a queueing network model of a production line operated under an EB policy. This model is denoted by L and consists of the N machines of the line, M_1, M_2, ..., M_N, separated by N − 1 infinite-capacity buffers, denoted by Q_1, Q_2, ..., Q_{N−1}. Figure 3 displays the queueing network model of the four-machine line shown in Figure 1. For a general N-machine line model, we make the following assumptions:

1. Parts flow from outside the system to M_1 to Q_1 to M_2 to ... to Q_{N−1} to M_N and exit the system.
2. Time is divided into discrete, equal-length periods.
3. In each period, M_n produces a part with probability p_n unless it is starved or blocked, n = 1, ..., N. This implies that the processing time of a part on machine M_n is geometrically distributed with mean 1/p_n, variance (1 − p_n)/p_n^2, and squared coefficient of variation 1 − p_n. Probability p_n is referred to as the production probability (or rate) of machine M_n in isolation.
4. The number of parts in buffer Q_n and machine M_{n+1} is denoted by y_n and is referred to as the stage WIP following machine M_n, n = 1, ..., N − 1; hence, Q_n is referred to as the stage buffer following M_n. Note that y_n is a function of the discrete time, but we omit this dependence for notational simplicity.
5. When a part flows from machine M_n to buffer Q_n, a token is generated and is placed in an associated finite buffer denoted by E_n, n = 1, ..., N − 1. This token is removed from E_n and is discarded when the part exits the last machine, M_N. The total number of tokens in E_n is denoted by x_n. Clearly, x_n is equal to the number of parts that have been produced by machine M_n but have not yet exited the network; i.e., it is equal to the echelon WIP following machine M_n in the physical line. It is easy to see that x_n is also equal to the sum of the stage WIP levels downstream of M_n; that is,

   x_n = \sum_{m=n}^{N-1} y_m, n = 1, ..., N − 1.  (1)

   Note that x_n, like y_n, is a function of the discrete time, but we omit this dependence for notational simplicity. Equation (1) can also be written recursively as

   x_n = y_n + x_{n+1}, n = 1, ..., N − 2, with x_{N−1} = y_{N−1}.  (2)

6. The capacity of buffer E_n is denoted by K_n and is equal to the capacity of echelon buffer B^E_n, n = 1, ..., N − 1, in the physical line; that is,

   K_n = 1 + \sum_{m=n}^{N-1} C_m, n = 1, ..., N − 1.  (3)

   The above expression implies that K_1 ≥ K_2 ≥ ... ≥ K_{N−1} ≥ 1. Reversely, the intermediate buffer capacities C_n can be written in terms of the K_n as follows:

   C_n = K_n − K_{n+1}, n = 1, ..., N − 2, and C_{N−1} = K_{N−1} − 1.  (4)

   Given that the capacity of E_n in the model is equal to the capacity of echelon buffer B^E_n in the physical line and that the number of tokens in E_n is equal to the echelon WIP following M_n in the physical line, we refer to buffer E_n, n = 1, ..., N − 1, as the echelon buffer.
7. Machine M_n, n = 2, ..., N, is starved if y_{n−1} = 0 or, equivalently from Equation (2), if x_{n−1} = x_n. Machine M_1 is never starved and always has one part in it.
8. Machine M_n, n = 1, ..., N − 1, is blocked if y_{n−1} ≥ 1 and x_n = K_n. Machine M_N is never blocked.
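The assumptions above can be exercised with a minimal discrete-time simulator. This is our own illustration, not the paper's decomposition method; in particular, the downstream-first update order within each period is our assumption, one common convention for discrete-time lines.

```python
# Minimal simulator of the EB-controlled line under Assumptions 1-8
# (our own sketch; the within-period update convention is an assumption).
import random

def simulate_eb_line(p, C, periods=200_000, seed=1):
    N = len(p)                                    # p[n]: rate of machine n+1
    K = [1 + sum(C[n:]) for n in range(N - 1)]    # echelon capacities, Eq. (3)
    y = [0] * (N - 1)                             # stage WIP levels
    produced = 0
    rng = random.Random(seed)
    for _ in range(periods):
        for n in range(N - 1, -1, -1):            # downstream first
            starved = n > 0 and y[n - 1] == 0     # Assumption 7
            x_n = sum(y[n:])                      # echelon WIP, Eq. (1)
            blocked = n < N - 1 and x_n == K[n]   # Assumption 8
            if not starved and not blocked and rng.random() < p[n]:
                if n > 0:
                    y[n - 1] -= 1                 # part leaves upstream stage
                if n < N - 1:
                    y[n] += 1                     # part enters stage buffer
                else:
                    produced += 1                 # part exits the line
    return produced / periods

# Four identical Bernoulli machines with p = 0.9 and unit intermediate buffers:
th = simulate_eb_line([0.9] * 4, [1, 1, 1])
print(round(th, 3))  # estimated throughput; below the isolated rate 0.9
```

Setting C = [0, 0, 2] in the same call simulates the CONWIP special case of Section 2, since all echelon capacities then coincide.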
Under the above assumptions, L is a discrete-time queueing network with geometrically distributed service times and BBS-PO. Each machine M_n, n = 1, ..., N − 1, behaves as a disassembly (split) server: every time M_n produces a part, it also generates a token; the part moves to stage buffer Q_n if machine M_{n+1} is occupied, and the token moves to buffer E_n. The vertical line at the end of the system in Figure 3 represents an assembly (merge) operation that assembles parts exiting the network with tokens from all buffers E_n. More specifically, when a part is produced by machine M_N, it draws a token from each of the echelon buffers E_1, ..., E_{N−1}, indicating that all echelon WIP levels have dropped by one unit. The finished part leaves the network and the tokens are discarded.
The geometric processing time assumption (Assumption 3) is the simplest assumption for capturing the randomness of machine processing times. As we will see in Section 6, the method that we develop in this article for analyzing the system also allows us to deal with the more general case where each machine M_n has load-dependent production probabilities, p_n(y_{n−1}), n = 2, ..., N. Such a case can be used to model situations where the effective processing times are affected by the workload (Bertrand and van Oijen, 2002). The existence of such situations has been supported by empirical evidence.
As was mentioned earlier, Assumption 6 implies that K_1 ≥ K_2 ≥ ... ≥ K_{N−1} ≥ 1. Now, suppose that K_n = K_{n+1} = K (equivalently, C_n = 0 in the physical line) for some n = 1, ..., N − 2. Then, the echelon WIP levels x_n and x_{n+1} are bounded as follows: x_n ≤ K_n = K and x_{n+1} ≤ K_{n+1} = K. By Assumption 8, machine M_{n+1} is blocked if y_n ≥ 1 and x_{n+1} = K_{n+1} = K. If we substitute y_n from Equation (2) and replace x_{n+1} by K, then the condition for M_{n+1} to be blocked becomes x_n − K ≥ 1. However, this condition cannot hold, since we assumed that x_n ≤ K; therefore, if K_n = K_{n+1} = K, machine M_{n+1} can never be blocked. In this case, echelon buffer E_{n+1} is obsolete, since it never plays its role of blocking machine M_{n+1}; hence, it can be eliminated. With this in mind, note that the behavior of a network in which K_n = K ≥ 1, n = 1, ..., N − 1 (equivalently, C_n = 0, n = 1, ..., N − 2, and C_{N−1} = C = K − 1 ≥ 0, in the physical line), is equivalent to the behavior of the same network in which all echelon buffers except E_1 (equivalently, all intermediate buffers except B_{N−1} in the physical line) have been eliminated, as shown in Figure 2(b). The total WIP following M_1 in such a network is capped by K = 1 + C; therefore, the physical line operates under a CONWIP policy, as mentioned in Section 1. Finally, note that although stage buffer Q_n has infinite capacity, the number of parts in it is effectively limited by K_n.
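The argument that M_{n+1} can never be blocked when K_n = K_{n+1} can also be checked exhaustively for a small instance. The sketch below (our own illustration) enumerates all echelon-WIP vectors satisfying the feasibility inequalities and confirms that none satisfies M_2's blocking condition when K_1 = K_2.

```python
# Exhaustive check (our illustration): if K_1 = K_2, no feasible state
# satisfies the blocking condition of machine M_2 (Assumption 8).
from itertools import product

def feasible_states(K):
    """Echelon-WIP vectors with x_{n+1} <= x_n <= K_n and the last x >= 0."""
    for x in product(*(range(k + 1) for k in K)):
        if all(x[n + 1] <= x[n] for n in range(len(K) - 1)):
            yield x

K = [3, 3, 2]          # K_1 = K_2 = 3, i.e., C_1 = 0 in the physical line
blocked_states = [
    x for x in feasible_states(K)
    # M_2 blocked iff y_1 = x_1 - x_2 >= 1 and x_2 = K_2
    if x[0] - x[1] >= 1 and x[1] == K[1]
]
print(blocked_states)  # []: M_2 can never be blocked when K_1 = K_2
```

Repeating the check with K = [4, 3, 2] (C_1 = 1) does yield blocking states for M_2, so the emptiness above is specific to the K_1 = K_2 case.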
To further clarify the connection between the physical production line operated under an EB policy, shown in Figure 1, and the queueing network model of that line, shown in Figure 3, consider the following. Let w n,m denote the number of parts in intermediate buffer B n and machine M n+1 that have been produced by machine M m but not by machine M m+1 , m = 1, . . . , n, in Figure 1. Also, let w n denote the total number of parts in buffer B n and machine M n+1 , n = 1, . . . , N − 1 (see Figure 1). These quantities satisfy relationships that link them to the stage WIP levels; combining those relationships with Equations (1) and (6) yields a corresponding expression in terms of the echelon WIP levels. Note that the quantities w n,m and w n , like x n and y n , are functions of the discrete time, but we omit this dependence for notational simplicity.
In the following section, we develop an approximation method for evaluating the performance of a production line operated under an EB policy. The method is based on decomposing the queueing network model described above into several easier-to-solve subsystems.

Decomposition of the EB-controlled production line model
Let us define the state of the queueing network model of a production line operated under an EB policy described in the previous section as the vector of echelon WIP levels x = (x 1 , x 2 , . . . , x N−1 ). Under Assumptions 1 to 8, x represents the state of a discrete-time Markov chain. To find the number of states of this chain, we note that the echelon WIP levels satisfy x n+1 ≤ x n ≤ K n , n = 1, . . . , N − 2, and 0 ≤ x N−1 ≤ K N−1 . From Equation (1), these inequalities can be written in terms of the stage WIP levels y n as follows: 0 ≤ y n ≤ K n − ∑_{m=n+1}^{N−1} y m , n = 1, . . . , N − 2, and 0 ≤ y N−1 ≤ K N−1 . Using these inequalities, we can express the total number of states of the Markov chain under the EB policy, denoted by NS E , as shown in Equation (8). This number can become very large even for problems of moderate size and is generally significantly larger than the corresponding number of states under the classic IB policy, denoted by NS I , given by Equation (9). To get an idea of the relative magnitudes of NS E and NS I , consider a production line with N = 7 machines and intermediate buffer capacities C n = 5, n = 1, . . . , 6, corresponding to echelon buffer capacities K 1 = 31, K 2 = 26, K 3 = 21, K 4 = 16, K 5 = 11, and K 6 = 6, from Equation (3). From Equations (8) and (9), the number of states for this system under the EB and IB policies is NS E = 1,404,781 and NS I = 46,656, respectively.
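As a sanity check on these magnitudes, the counts can be explored by direct enumeration. The sketch below is our own illustrative code, not part of the original method; it counts the feasible echelon-WIP vectors under the EB policy using the inequalities above, and takes NS I as the product of the (C n + 1) terms, consistent with the value 46,656 = 6^6 reported in the text:

```python
# Illustrative enumeration of the state-space sizes for the 7-machine
# example with C_n = 5 (not the authors' code).
from functools import lru_cache

C = [5] * 6                              # intermediate buffer capacities C_1..C_6
K = [1 + sum(C[n:]) for n in range(6)]   # echelon capacities from Equation (3)
assert K == [31, 26, 21, 16, 11, 6]

def count_eb_states(K):
    """Count vectors (x_1, ..., x_{N-1}) with 0 <= x_{n+1} <= x_n <= K_n."""
    @lru_cache(maxsize=None)
    def rec(n, upper):
        if n == len(K):
            return 1
        return sum(rec(n + 1, x) for x in range(min(upper, K[n]) + 1))
    return rec(0, K[0])

NS_E = count_eb_states(K)

NS_I = 1
for c in C:
    NS_I *= c + 1       # one buffer-occupancy value per intermediate buffer

assert NS_I == 46656    # = 6**6, as reported in the text
assert NS_E > NS_I      # the EB state space is far larger than the IB one
```
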
Given the explosion in the number of states of the Markov chain model of a production line operated under an EB policy, in this section, we develop an approximation method for evaluating the performance of such a line. This method is based on decomposing the queueing network model of the original line of N machines and N − 1 echelon buffers into N − 1 nested segments denoted by L n , n = 1, . . . , N − 1, as shown in Figure 4 for the four-machine model depicted in Figure 3.
Segment L n , n = 2, . . . , N − 1, represents the part of the system downstream of machine M n−1 , with segment L 1 representing the entire system. Each segment is then approximated by a two-machine subsystem, denoted by L̃ n , that can be analyzed in isolation. Figure 5 shows the three subsystems, L̃ 1 , L̃ 2 , and L̃ 3 , that approximate the three segments, L 1 , L 2 , and L 3 , shown in Figure 4. Each subsystem can be analyzed independently of the other subsystems, but some of its exogenously defined parameters depend on the analysis of its neighboring subsystems, as we will see later. The ultimate goal of the decomposition is to set the exogenous parameters of each subsystem so that its behavior mimics as closely as possible the behavior of the corresponding segment in the original system. Let us now take a closer look at the subsystems in Figure 5. Each subsystem L̃ n has an upstream infinite buffer Q n−1 (except for L̃ 1 , which has none), an upstream machine M n , an intermediate finite buffer Ẽ n , and a downstream machine M̃ n+1 (except for the last subsystem L̃ N−1 , where the downstream machine is denoted by M N ). The first two elements, namely, Q n−1 and M n , represent stage buffer Q n−1 and machine M n in segment L n of the original system; hence, M n has production probability p n , and the number of parts in Q n−1 and M n is denoted by y n−1 , where y n−1 = 0, . . . , K n−1 , as in the original system (see Figure 3).
The ensemble of buffer Ẽ n and machine M̃ n+1 , in subsystem L̃ n , n = 1, . . . , N − 2, represents in an aggregate way the entire part of the original system downstream of machine M n . This means that it represents segment L n+1 , which, in turn, is approximated by subsystem L̃ n+1 . It also represents echelon buffer E n , due to both L n+1 and E n being fed and depleted simultaneously and hence always having the same number of entities in them. With this in mind, the total capacity of Ẽ n and M̃ n+1 , just like the capacity of E n and the maximum number of parts in L n+1 , is K n . More specifically, Ẽ n has capacity K n − 1 and M̃ n+1 has unit capacity. Moreover, the total number of parts in Ẽ n and M̃ n+1 , just like the number of tokens in E n and the number of parts in L n+1 , is denoted by x n , where x n = 0, . . . , K n (see Figure 3). In the original system, clearly, the higher the value of x n , the more likely it is that a part will come out of segment L n+1 . To capture this connection in the approximation method, we assume that M̃ n+1 has a load-dependent production probability denoted by q n+1 (x n ), x n = 0, . . . , K n . This probability is exogenously defined when analyzing subsystem L̃ n . As we will see later, eventually, it must be equal to the conditional throughput of subsystem L̃ n+1 (the surrogate of segment L n+1 in the original system), which is denoted by ν n+1 (x n ) (see Figure 5).
In the last subsystem, L̃ N−1 , machine M N simply represents the last machine in the original system. It is therefore modeled as a simple Bernoulli machine with production probability p N , just like M N in the original system.
Buffer Q n−1 in subsystem L̃ n , n = 2, . . . , N − 1, receives parts arriving from the outside. The arrival process to this buffer represents in an aggregate way the departure process of parts from machine M n−1 in segment L n−1 in the original system. An important property of that machine is that it is blocked if echelon buffer E n−1 is full; i.e., if x n−1 = K n−1 . To capture this property in the approximation method, we require that the arrival process to buffer Q n−1 in subsystem L̃ n , n = 2, . . . , N − 1, depends on x n−1 . More specifically, we assume that Q n−1 receives parts with a state-dependent arrival probability denoted by r n−1 (x n−1 ), x n−1 = 0, . . . , K n−1 . This probability is exogenously defined when analyzing L̃ n and has the property that r n−1 (K n−1 ) = 0. As we will see later, eventually, it must be equal to the internal state-dependent arrival probability of parts to buffer Ẽ n−1 in subsystem L̃ n−1 (the surrogate of segment L n−1 in the original system), which is denoted by λ n−1 (x n−1 ) (see Figure 5).
Finally, the total number of parts in subsystem L̃ n , n = 2, . . . , N − 1, just like the total number of parts in segment L n of the original system, is denoted by x n−1 ; i.e., x n−1 = y n−1 + x n , where x n−1 = 0, . . . , K n−1 (see also Equation (2)).
To evaluate the performance of the original system L, we must address the following two problems: Problem 1: How can we analyze each subsystem L̃ n in isolation, given the exogenously defined state-dependent external arrival probabilities r n−1 (x n−1 ), x n−1 = 0, . . . , K n−1 (except for L̃ 1 , which has no external arrivals), and the load-dependent production probabilities of machine M̃ n+1 , q n+1 (x n ), x n = 0, . . . , K n (except for L̃ N−1 , where machine M N has production probability p N )? Problem 2: How can we determine the values of these exogenous parameters so that the behavior of each subsystem mimics as closely as possible the behavior of the corresponding segment of the original system?
We address these problems in Sections 6 and 7, respectively. Once these problems have been solved, the performance measures of the original system L can be approximated from the performance measures of subsystemsL n , n = 1, . . . , N − 1.

Analysis of the two-machine subsystems in isolation
In this section, we describe how to analyze each subsystem L̃ n , n = 1, . . . , N − 1, in isolation. First, we concentrate on subsystems L̃ n , n = 2, . . . , N − 1, that have external arrivals, and then we proceed with the simpler subsystem L̃ 1 that has no external arrivals.
6.1. Analysis of subsystem L̃ n , n = 2, . . . , N − 1
Figure 6 shows the queueing network model of subsystem L̃ n for the more general case where M n has load-dependent production probability p n (y n−1 ), n = 2, . . . , N − 1. We consider this generalization to show that we can easily apply our analysis to the case where machine M n , n = 2, . . . , N, in the original model has load-dependent production probability, as mentioned in Section 4. If we define the state of each subsystem L̃ n as the vector of the WIP levels (y n−1 , x n ), then (y n−1 , x n ) represents the state of a two-dimensional discrete-time Markov chain with state-dependent transition probabilities that are functions of the load-dependent production probabilities p n (y n−1 ), y n−1 = 0, . . . , K n−1 − x n ; the state-dependent arrival probabilities r n−1 (x n−1 ), x n−1 = 0, . . . , K n−1 ; and the load-dependent production probabilities q n+1 (x n ), x n = 0, . . . , K n . This Markov chain is irreducible, finite, and aperiodic; therefore, unique stationary probabilities exist. The number of states of this chain is (K n−1 + 1)(K n + 1) − (K n + 1)K n /2. Figure 7 shows the state transition diagram of this chain for K n−1 = 7 and K n = 4, indicating only the inter-state transitions but not the transition probabilities.
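As a quick check (our own sketch, not part of the original analysis), the state count can be verified by enumerating the feasible pairs (y n−1 , x n ), which satisfy 0 ≤ x n ≤ K n and 0 ≤ y n−1 ≤ K n−1 − x n :

```python
# Enumerate the feasible states (i, j) = (y_{n-1}, x_n) and compare the
# result with the closed-form count given in the text.
def n_states(K_prev, K_n):
    enumerated = sum(1 for j in range(K_n + 1)           # x_n = 0..K_n
                       for i in range(K_prev - j + 1))   # y_{n-1} = 0..K_{n-1}-x_n
    closed_form = (K_prev + 1) * (K_n + 1) - (K_n + 1) * K_n // 2
    assert enumerated == closed_form
    return enumerated

# The Figure 7 case: K_{n-1} = 7 and K_n = 4 gives 8*5 - 10 = 30 states.
assert n_states(7, 4) == 30
```
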
To find the stationary probabilities of this Markov chain, denoted by P n (y n−1 , x n ), we must write the balance equations. In what follows, we give the expressions for these equations, where, for notational simplicity, we have: dropped the subscripts from probabilities r n−1 (·), p n (·), q n+1 (·), and P n (·, ·); used an overbar to indicate the complement of a probability (for instance, p̄ ≡ 1 − p); and used i and j to denote states y n−1 and x n , respectively.
The form of the balance equations differs depending on whether the states of the Markov chain are in the middle, on the boundaries, or at the corners of the state transition diagram, as indicated in Figure 7. There are nine types of states; hence, there are nine types of balance equations. Figures 8 to 10 show the detailed state transition diagrams for all types of states. Note that the transition probabilities at the extreme states are r(K n−1 ) = p(0) = q(0) = 0; therefore, r̄(K n−1 ) = p̄(0) = q̄(0) = 1. This fact helps reduce the total number of expressions required to describe all balance equations.

Normalization equation: the stationary probabilities of all states must sum to one.
The balance equations above set the steady-state probability flow rate out of each state equal to the steady-state probability flow rate into this state. To see how they are derived, consider the first equation for the top left corner state (0, 0), which represents the state where subsystem L̃ n is totally empty. The only transition out of that state is to state (1, 0). This transition occurs if a part arrives in queue Q n−1 from the outside. The probability of this event is r(0). Hence, the probability flow rate out of state (0, 0) is P(0, 0)r(0). The only transition into state (0, 0) is from state (0, 1). This transition occurs if no part arrives in Q n−1 and machine M̃ n+1 produces the part in it. The probability of this event is r̄(1)q(1). Hence, the probability flow rate into state (0, 0) is P(0, 1)r̄(1)q(1). The balance equation is therefore P(0, 0)r(0) = P(0, 1)r̄(1)q(1). The other equations are derived similarly by taking into account the three types of events that may or may not occur in each period, namely, the arrival of a part in Q n−1 , the production of a part by M n , and the production of a part by M̃ n+1 .
The above system of equations is linear and has a unique solution. It can be solved using any numerical analysis scheme. In our numerical examples, we use the Gauss-Seidel method, where in each iteration we sequentially update the stationary probability of each state using the most recent values of the stationary probabilities of the other states involved. At the end of each iteration, we normalize all probabilities. We terminate the iterations when the maximum absolute percentage difference between two successive iterations is below a very small number ε. Once we have computed the stationary probabilities, we can use them to calculate the following performance measures of interest:
1. ν n (x n−1 ), x n−1 = 0, . . . , K n−1 : Conditional throughput of subsystem L̃ n .
2. λ n (x n ), x n = 0, . . . , K n : Internal state-dependent arrival probability of parts to buffer Ẽ n .
3. x̄ n : Average WIP level of buffer Ẽ n .
4. θ n−1 : Overflow probability of buffer Q n−1 , defined as the probability that y n−1 will increase by one unit when y n−1 ≥ K n−1 − K n + 1 = C n−1 + 1; θ n−1 represents the probability that a part will be produced by machine M n−1 and will be physically transferred for storage in an intermediate buffer downstream of B n−1 because B n−1 is full (hence the term "overflow"). This probability is important especially if the transportation cost associated with this transfer is significant.
Note that in the above definitions, we have restored the original notation. The derivations of the expressions for these performance measures can be found in Section S1 of the online supplement.
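To illustrate the Gauss-Seidel scheme just described, here is a generic sketch (our own code, not the authors' implementation) for an arbitrary finite discrete-time Markov chain given its one-step transition matrix; the subsystem chains of this section fit this mold once their transition probabilities are assembled:

```python
# Generic Gauss-Seidel solver (illustrative) for the stationary
# distribution of a finite discrete-time Markov chain. For each state i,
# the balance equation pi_i * (1 - P[i][i]) = sum_{j != i} pi_j * P[j][i]
# updates pi_i in place using the most recent values of the other
# probabilities; all probabilities are normalized after every sweep.
# Assumes P[i][i] < 1 for all i (no absorbing states).
def stationary_gauss_seidel(P, eps=1e-10, max_sweeps=100000):
    n = len(P)
    pi = [1.0 / n] * n                       # uniform initial guess
    for _ in range(max_sweeps):
        max_rel = 0.0
        for i in range(n):
            inflow = sum(pi[j] * P[j][i] for j in range(n) if j != i)
            new = inflow / (1.0 - P[i][i])   # balance equation for state i
            max_rel = max(max_rel, abs(new - pi[i]) / max(pi[i], 1e-300))
            pi[i] = new
        total = sum(pi)
        pi = [x / total for x in pi]         # normalization equation
        if max_rel < eps:
            return pi
    return pi

# Two-state example: the balance equation gives pi proportional to (2, 1).
pi = stationary_gauss_seidel([[0.9, 0.1], [0.2, 0.8]])
assert abs(pi[0] - 2 / 3) < 1e-8 and abs(pi[1] - 1 / 3) < 1e-8
```
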

Analysis of subsystem L̃ 1
The first subsystem of the decomposition, L̃ 1 , shown at the top of Figure 5, differs from the other subsystems in that there is no input process to machine M 1 ; hence, it is simpler. Just like M 1 in the original system, machine M 1 in subsystem L̃ 1 is never starved, and in every period it produces a part with probability p 1 unless it is blocked because buffer Ẽ 1 is full. If we define the state of L̃ 1 as the WIP level x 1 , then x 1 represents the state of a discrete-time finite-state birth-death process, for which the stationary probabilities, denoted by P 1 (x 1 ), can be easily computed. The state transition diagram of this Markov chain is shown in Figure 11. As previously, for notational simplicity, we have dropped the subscripts from probabilities p 1 , q 2 (·), and P 1 (·). We have also used an overbar to indicate the complement of a probability, and j to denote state x 1 .
To compute the stationary probabilities of x 1 , we define a coefficient for each state j = 0, . . . , K 1 ; in the expressions for these coefficients, we exploit the fact that q(0) = 0 and, therefore, q̄(0) = 1. The stationary probabilities P 1 (j), j = 0, . . . , K 1 , are then obtained by normalizing these coefficients.
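Since L̃ 1 reduces to a finite birth-death chain, its stationary distribution has the standard product form. The following generic sketch (our own code, with hypothetical up- and down-probabilities u(j) and d(j); the exact expressions for L̃ 1 are those referenced above) computes it by normalizing the product coefficients:

```python
# Stationary distribution of a generic finite discrete-time birth-death
# chain (illustrative). With up-probability u[j] from state j and
# down-probability d[j] from state j, the stationary probability of
# state j is proportional to prod_{i=1}^{j} u[i-1] / d[i].
def birth_death_stationary(u, d, K):
    coeff = [1.0]                        # coefficient of state 0
    for j in range(1, K + 1):
        coeff.append(coeff[-1] * u[j - 1] / d[j])
    total = sum(coeff)
    return [c / total for c in coeff]    # normalize, as for P_1(j)

# Hypothetical example with constant u = 0.3, d = 0.5 and K = 2:
# the coefficients are 1, 0.6, 0.36, so P(0) = 1/1.96.
# d[0] is a placeholder: state 0 has no down-transition.
P = birth_death_stationary([0.3, 0.3], [None, 0.5, 0.5], 2)
assert abs(P[0] - 1 / 1.96) < 1e-9
assert abs(sum(P) - 1.0) < 1e-12
```
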
Once we have computed the stationary probabilities, we can use them to calculate the average throughput of subsystem L̃ 1 , denoted by ν 1 , and the average WIP level in echelon buffer Ẽ 1 , denoted by x̄ 1 , where we have restored the original notation. These two measures are calculated as weighted sums of the stationary probabilities. Finally, note that the internal state-dependent arrival probability of parts to buffer Ẽ 1 , denoted by λ 1 (x 1 ), x 1 = 0, . . . , K 1 , follows directly from the production behavior of machine M 1 .

Analysis of the entire original EB-controlled production line model
The unknown parameters of each subsystem L̃ n are the state-dependent external arrival probabilities r n−1 (x n−1 ), x n−1 = 0, . . . , K n−1 (except in L̃ 1 , where there are no external arrivals), and the load-dependent production probabilities q n+1 (x n ), x n = 0, . . . , K n , of the downstream machine M̃ n+1 (except in L̃ N−1 , where the downstream machine is identical to machine M N in the original system and therefore has known production probability p N ). To determine the values of these parameters, we set up a system of equations that relate the flow of parts in subsystem L̃ n with the flow of parts in the neighboring subsystems L̃ n+1 and L̃ n−1 . More specifically, as we wrote earlier, in subsystem L̃ n , n = 1, . . . , N − 2, M̃ n+1 is an aggregate representation of subsystem L̃ n+1 , which is the surrogate of segment L n+1 in the original system. The load-dependent production probabilities of M̃ n+1 , q n+1 (x n ), x n = 0, . . . , K n , should therefore be equal to the conditional throughput of L̃ n+1 , ν n+1 (x n ), x n = 0, . . . , K n . Similarly, in subsystem L̃ n−1 , n = 2, . . . , N − 1, M̃ n is an aggregate representation of subsystem L̃ n . The external arrival probabilities to buffer Q n−1 in L̃ n , r n−1 (x n−1 ), x n−1 = 0, . . . , K n−1 , should therefore be equal to the internal state-dependent arrival probabilities of parts from machine M n−1 to buffer Ẽ n−1 in L̃ n−1 , λ n−1 (x n−1 ), x n−1 = 0, . . . , K n−1 .
For each subsystemL n , the conditional throughput ν n (x n−1 ) and the internal state-dependent arrival probability λ n (x n ) can be computed by analyzing the subsystem in isolation, given the values of the production probabilities p n (y n−1 ), y n−1 = 0, . . . , K n−1 − x n , and q n+1 (x n ), x n = 0, . . . , K n , as shown in Section 6. This means that ν n+1 (x n ) in Equation (17) is a function of p n+1 (y n ) and q n+2 (x n+1 ). Also, λ n−1 (x n−1 ) in Equation (18) is a function of p n−1 (y n−2 ) and q n (x n−1 ). Hence, the unknown parameters r n−1 (x n−1 ) and q n+1 (x n ) in Equations (17) and (18) are the solution of a fixed-point problem. To determine their values, we use the following iterative algorithm.
Algorithm for analyzing the entire production system
Step 1. Initialization:
1.1. Set the unknown external arrival probabilities of each subsystem L̃ n (except L̃ 1 , which receives no external arrivals) to some initial value. A reasonable initial value that we have used in our numerical experiments is the smallest production probability of all machines upstream of M n , given by Equation (19).
1.2. Set the unknown production probabilities of machine M̃ n+1 in each subsystem L̃ n (except L̃ N−1 , where the production probability of the downstream machine is p N ) to some initial value. A reasonable initial value that we have used in our experiments is the smallest production probability of all machines downstream of M n , given by Equation (20).
Step 2. Main iteration: Iterate backwards and forwards until the external and internal arrival probabilities converge; i.e., until r n−1 (x n−1 ) = λ n−1 (x n−1 ), x n−1 = 0, . . . , K n−1 , n = 2, . . . , N − 1. In each backward pass, the production probabilities q n+1 (·) are updated from the most recently computed conditional throughputs ν n+1 (·); in each forward pass, the arrival probabilities r n−1 (·) are updated from the most recently computed λ n−1 (·). Upon convergence, compute the remaining performance measures, including θ N−1 , from Equations (11) to (13), respectively, for n = N − 1.
Note that the first time each subsystem L̃ n , n = 2, . . . , N − 1, is solved using the method presented in Section 6, the stationary probabilities of the Markov chain whose states are (y n−1 , x n ) must be initialized. The simplest way to do this is to set them all equal so that their sum is one. A more sophisticated way is to set P n (y n−1 , x n ) equal to the normalized product of the approximate marginal stationary distributions of y n−1 and x n in isolation. The approximate marginal distribution of y n−1 in isolation can be found by solving a two-machine one-buffer line (as a discrete-time finite-state birth-death process), where the upstream and downstream machines have production probabilities r init n−1 (x n−1 ) and q init n (x n−1 ) given by Equation (19) and Equation (20), respectively. Similarly, the approximate marginal distribution of x n in isolation can be found by solving a two-machine one-buffer line, where the upstream and downstream machines have production probabilities r init n (x n ) and q init n+1 (x n ) given by Equation (19) and Equation (20), respectively. These problems can be solved extremely quickly. From then on, each time subsystem L̃ n , n = 2, . . . , N − 1, is solved again, the stationary probabilities from the previous time are used as initial values. Numerical experimentation has shown that this method results in significant gains in overall computational time.
Finally, the criterion that we used to detect whether λ n (x n ) ≈ r n (x n ), x n = 0, . . . , K n − 1, in Step 2 of the above procedure is max x n =0,...,K n −1 {|λ n (x n ) − r n (x n )|/r n (x n )} < ε, where ε is a very small number.
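The overall control flow of the algorithm can be sketched as follows (our own illustrative skeleton, not the authors' code; `solve_subsystem` is a hypothetical placeholder for the isolated analysis of Section 6, and the toy stub below merely exercises the iteration logic):

```python
# Structural sketch of the backward-forward fixed-point iteration.
# `solve_subsystem(n, r_in, q_out)` is assumed to return the pair
# (nu_n, lam_n): the conditional throughput and the internal
# state-dependent arrival probabilities of subsystem n.
def decompose(p, K, solve_subsystem, eps=1e-4, max_iter=500):
    # p[1..N]: production probabilities; K[1..N-1]: echelon capacities.
    N = len(p) - 1
    last = N - 1                                  # index of the last subsystem
    # Step 1: initialization (cf. Equations (19) and (20)).
    r = {n: [min(p[1:n])] * (K[n - 1] + 1) for n in range(2, last + 1)}
    q = {n: [min(p[n + 1:])] * (K[n] + 1) for n in range(1, last)}
    for _ in range(max_iter):
        # Backward pass: q_{n+1}(.) <- nu_{n+1}(.).
        for n in range(last - 1, 0, -1):
            nu, _ = solve_subsystem(n + 1, r.get(n + 1), q.get(n + 1))
            q[n] = nu
        # Forward pass: r_{n-1}(.) <- lambda_{n-1}(.), with convergence check.
        converged = True
        for n in range(2, last + 1):
            _, lam = solve_subsystem(n - 1, r.get(n - 1), q.get(n - 1))
            if max(abs(a - b) / max(b, 1e-12)
                   for a, b in zip(lam, r[n])) >= eps:
                converged = False
            r[n] = lam
        if converged:
            return r, q
    raise RuntimeError("fixed-point iteration did not converge")

# Toy run with a constant stub solver (hypothetical numbers) just to
# exercise the control flow; a real solver would come from Section 6.
p = [None, 0.9, 0.8, 0.7]          # N = 3 machines, 1-indexed
K = [None, 3, 2]                   # K_1 = 3, K_2 = 2
stub = lambda n, r_in, q_out: ([0.5] * (K[1] + 1), [0.4] * (K[1] + 1))
r, q = decompose(p, K, stub)
assert r == {2: [0.4] * 4} and q == {1: [0.5] * 4}
```
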

Numerical results
In this section, we evaluate the accuracy and efficiency of the decomposition method developed in Sections 6 and 7 by comparing it against simulation for several instances of two numerical examples, also exploring the effect of system parameters on system performance. In all instances, we used the value of ε = 0.0001 for the convergence criterion, both in the procedure for analyzing each subsystem L̃ n in isolation, described in Section 6.1, and in the algorithm for analyzing the original system L, described in Section 7. Regarding the convergence of these two algorithms, we know from Markov chain theory that the balance equations have a unique solution, due to the underlying Markov chain being irreducible, finite, and aperiodic. This means that the procedure for analyzing each subsystem L̃ n in isolation should always converge. Although we cannot similarly guarantee the convergence of the algorithm for analyzing the entire system, we can attest that in all of the instances that we ran, the algorithm converged. Both the decomposition and simulation algorithms were written in Matlab and were run on a PC with an Intel(R) Core(TM) i7-920 CPU @ 2.67 GHz (Supplier: Mentor Hellas Ltd: Scientific Engineering Software; Location: 15351 Pallini Attiki, Greece).
In Example 1, we consider a production line consisting of N = 5 machines and four buffers. For this system, we evaluated 34 different instances (cases). Table 1 shows the input data for each case, namely, the production probabilities of the machines, p n , n = 1, . . . , 5, the capacities of the intermediate buffers, C n , n = 1, . . . , 4, and the resulting capacities of the echelon buffers, K n , n = 1, . . . , 4, computed from Equation (3).
The cases are divided into three groups as far as the distribution of the production probabilities among the machines is concerned. Cases 1 to 4, 16, 17, 23, and 29 represent balanced lines where all machines have the same production probabilities. Cases 5 to 13, 18 to 20, 24 to 26, and 30 to 32 represent lines where all of the machines have the same production probabilities, except for one that has a smaller probability, making it the slower machine. That machine is either the first, middle, or last machine. Finally, cases 14, 15, 21, 22, 27, 28, 33, and 34 represent unbalanced lines where the machines have increasing or decreasing production probabilities.
In terms of the intermediate buffer capacity allocation, the cases are divided into two groups. Cases 1 to 16 represent lines where the capacities of all intermediate buffers are the same, implying that the echelon buffer capacities increase by the same amount as we move upstream in the line. In cases 17 to 34, the capacities of all intermediate buffers except the last one are zero, implying that all echelon buffer capacities are the same. As mentioned in Section 4, this corresponds to a line operating under CONWIP. Table 2 shows the performance measure estimates of the EB policy obtained by decomposition. These measures are the average stage WIP levels, denoted by ȳ n , n = 1, . . . , 4; the average line throughput, denoted by ν; the average overflow rate of buffer B n , denoted by θ n , n = 1, . . . , 3; and the computation time, CPU, in seconds. We report the average stage WIP levels rather than the average total WIP levels because the inventory holding cost rate often differs across stages; typically, it increases with the stages because of the value added at each stage. Therefore, it is important to explore the accuracy of the decomposition method at the individual stage level rather than at the level of the entire production line. Recall that the values of ȳ n , ν, and θ n are computed as the final values of ȳ n , ν 1 , and θ n in the algorithm described in Section 7. Note that θ 4 in the five-machine example and, more generally, θ N−1 in the N-machine case, is zero because there is no overflow of parts in the last buffer B N−1 . Table 3 shows the percentage difference between the decomposition and simulation estimates of these performance measures for the five-machine line in Example 1.
If the inventory holding cost rate increases toward the downstream stages, then the total weighted inventory cost could end up being higher in the former line (the CONWIP-like allocation) than it is in the latter line (the even allocation), even though the total average WIP is smaller under CONWIP. Moreover, the former line also yields higher overflow probabilities than the latter line, resulting in a higher transfer rate, and hence cost, of parts to remote buffers. Finally, to explore the difference in performance between the EB and IB policies, we simulated the five-machine production line under the IB policy for the 34 cases in Table 1. The performance measure estimates of that policy are shown in Table S2 in the online supplement (Section S2). In all cases, except case 11, both the average throughput and the average total stage WIP level under the EB policy are higher than their respective values under the IB policy. As expected, the biggest differences in average throughput (60% and above) are observed in cases 17 to 34, where the EB policy is equivalent to CONWIP. Not surprisingly, these are the cases with the biggest differences in the average total stage WIP. Case 11 is the only case where the average throughput and the average total stage WIP level under the EB policy are lower than their respective values under the IB policy. As mentioned in Section 1, this can happen in short lines with low buffer capacities, where the WIP-limit under the EB policy is significantly smaller than the WIP-limit under the IB policy. Case 11 fits this description because the WIP-limit under the EB policy is five, whereas the WIP-limit under the IB policy is eight. However, the low buffer capacity is not the only reason that the average throughput and the average total stage WIP level under the EB policy are lower than their respective values under the IB policy in case 11. For example, note that cases 1, 5, and 8 have the exact same buffer allocation as case 11 but result in higher average throughput and average total stage WIP level under the EB policy than they do under the IB policy.
The difference between these cases and case 11 is that in case 11 there is a slower machine at the end of the line. That machine seems to block the release of new parts into the line more frequently under EB, resulting in a reduced average throughput and average total stage WIP level compared with IB. Finally, recall that a disadvantage of the EB policy compared with the IB policy is that in the former policy, parts are transferred for storage to remote downstream buffers at rates equal to the overflow probabilities. This transfer may incur a cost. Under the IB policy, on the other hand, no part is ever transferred to a remote buffer.
In Example 2, we consider a production line consisting of N = 10 machines and nine buffers. For this system, we evaluated 27 different instances. The rationale behind the choice of parameter values for the different instances is similar to that in Example 1. For space considerations, the input data and the results for each instance are presented in Section S3 in the online supplement. The observations on the results of Example 1 presented above still hold for the results of Example 2. One important difference is that the computational time of the decomposition method in Example 2 is higher than it is in Example 1. This is natural because in Example 2 there are twice as many stages (machines) and, more importantly, the echelon buffer capacities are much higher. Nonetheless, in most cases, the computational time of the decomposition method still remains significantly lower than the corresponding time of simulation.

Conclusions
We introduced the EB policy for controlling the flow of parts through a production line, and we developed a decomposition-based approximation method for evaluating its performance. Our numerical results indicate that this method is computationally efficient and highly accurate when compared with simulation. They also indicate that an EB policy where the entire buffer space is allocated to the last intermediate buffer (CONWIP) yields higher average throughput and lower average WIP than the same policy in which the buffer space is evenly allocated among all intermediate buffers. The tradeoff is that the concentration of the average total WIP toward the downstream stages and the overflow probabilities are higher in the former case than they are in the latter case. At the same time, the EB policy generally yields higher average throughput, at the cost of higher average WIP and non-zero overflow probabilities, than the IB policy. Based on these results, a promising direction for future research is to use the developed approximation method to optimally design the echelon buffer capacities and compare the performance of the resulting optimal EB policy against that of the optimal IB and CONWIP policies. Another possible direction is to generalize the decomposition method to more complicated machine behavior models than the Bernoulli model. Even under the Bernoulli machine assumption, however, it would also be useful to come up with a more efficient way to analyze the two-machine subsystems in isolation in the decomposition method.