Price Setting with Menu Cost for Multi-Product Firms

We model the pricing decisions of a multi-product firm that faces a fixed “menu” cost: once the cost is paid, the firm can adjust the price of all its products. We characterize analytically the steady state firm’s decision in terms of the structural parameters: the variability of the flexible prices, the curvature of the profit function, the size of the menu cost, and the number of products that are sold. We provide expressions for the steady state frequency of adjustment, the hazard rate of price adjustments, and the size distribution of price changes, all in terms of the structural parameters. We study analytically the impulse response of aggregate prices and output to a monetary shock. The cumulative response of output to a monetary shock is the product of three terms: the steady state standard deviation of price changes, the average time elapsed between price changes, and a function of both the number of products and the size of the monetary shock. The size of the cumulative response of output and the length of the half-life of the response of aggregate prices to a monetary shock increase with the number of products, both of them more than double as the number of products goes from 1 to ten, quickly converging to the ones of Taylor’s staggered price model.


Introduction and Summary
The "small" price changes observed in the micro data challenge a basic prediction of classic menu-cost models, such as Barro (1972), Dixit (1991), and Golosov and Lucas (2007): when calibrated to yield infrequent price adjustment, as observed in the data, these models predict price changes that are "large". Motivated by this evidence, Lach and Tsiddon (1996, 2007) introduced the hypothesis of economies of scope in price setting: a firm pays a fixed menu cost to simultaneously adjust the prices of n goods. In this case small adjustments occur because, once the firm decides to act because some goods are mispriced, it also finds it convenient to adjust the prices of all other goods at no extra cost. There is some evidence consistent with this hypothesis: Lach and Tsiddon (1996) show that price changes are synchronized within stores but staggered across stores; Cavallo's (2010) study of large online supermarket chains finds that price changes of similar goods are synchronized within a store; and Bhattarai and Schoenle (2010) document that firms selling more goods display a higher frequency of price adjustment as well as smaller adjustments.¹ Midrigan (2011) studies numerically a model that departs from Golosov and Lucas (2007) by having economies of scope in adjusting the prices of two goods as well as fat-tailed shocks. Interestingly, he finds that with both of these modifications the real effects of monetary policy shocks are substantially larger.
The fundamental motives behind the different results in Golosov and Lucas (2007) and Midrigan (2011) remain largely unexplored, partly because these models are involved (studying the dynamic response of an economy with heterogeneous agents requires a model of the cross-section distribution of firms and an analysis of the impact and convergence to steady state that follow a monetary shock), and partly because, as explained above, they differ in several dimensions. The challenges involved in solving this problem have led previous authors to resort to numerical methods. We think that the lack of analytical results on the impact and propagation of shocks in menu cost models prevents a clear understanding of the key determinants of the effects of monetary shocks. This paper contributes to the analysis of this problem by presenting two sets of novel results. First, we characterize analytically the decision problem, as well as many of the steady state statistics, of firms that have economies of scope in price setting for an arbitrary number n ≥ 1 of goods. This is useful to guide the empirical literature that uses micro data to test the hypothesis of economies of scope in price setting, and to identify and estimate the relevant parameters. For instance, the distribution of price changes is bimodal for n = 1 and n = 2, whereas for n ≥ 4 it is bell shaped, with a large mass of small price changes, which better describes most empirical distributions.
¹ An incomplete list of additional contributions documenting this type of behavior includes Lach and Tsiddon (1992), Baudry et al. (2007), Dhyne and Konieczny (2007), Dutta et al. (1999), Midrigan (2007, 2011), and Neiman (2010).
Second, we develop an analytical characterization of the whole impulse response function of the aggregate price level (and of output) to a monetary shock. This characterization extends the pioneering contributions of Caballero and Engel (1991, 1993) beyond the analysis of the impact effect, to any number of goods n ≥ 1, and justifies some of their simplifying assumptions. The results illustrate how the effects of monetary shocks depend on n, and on two simple statistics: the standard deviation and the frequency of price changes. These findings are useful to identify which observable statistics are informative about the effectiveness of monetary policy in actual economies, and to interpret the numerical investigations of Golosov and Lucas (2007) and Midrigan (2011).

Summary of the firm's problem and steady state statistics
We study a stylized version of the problem of a multi-product firm that can revise prices only after paying a fixed cost. The key assumption is that once the fixed menu cost is paid the firm can adjust the prices of all its products. The analytical solution of this problem, whose formulation was outlined in Lach and Tsiddon (1996, 2007), is novel in the literature.
Section 2 sets up the firm's problem as the minimization of the profit losses incurred relative to the flexible price case, i.e. the case with no menu cost. We assume that the static profit-maximizing prices for each of the n products, which coincide with the prices that would be charged without menu cost, follow n independent random walks without drift and with volatility σ per unit of time. We refer to the vector of differences between the frictionless prices and the actual prices charged as the vector of price gaps. The period return function is assumed to be proportional to the sum of the squares of the price gaps.
The proportionality constant B measures the second order per period losses associated with charging a price different from the optimum, i.e. it is a measure of the curvature of the profit function.² We assume that if a fixed cost ψ is paid the firm can simultaneously change all the prices. The firm minimizes the expected discounted cost, which includes the stream of lost profits from charging prices different from the frictionless ones as well as the fixed cost paid at the times of adjustment. Alternatively, as explained below, this setup can be interpreted as a rational inattention problem, where ψ represents an information gathering cost. We completely characterize the solution of the problem in terms of the structural parameters: the variability of the flexible prices σ, the curvature of the profit function B, the size of the menu cost ψ, the discount rate r, and the number of products n. Our characterization is slightly more general than the seminal work of Barro (1972), Karlin and Taylor (1981) and Dixit (1991) for the case of n = 1 and, more importantly, it covers the multi-product case of any n ≥ 1.
The technical challenges in the analytical study of price setting problems with menu cost have led researchers to consider simple environments. For instance a quadratic profit function, or a quadratic approximation to it, has been used in the seminal work on price setting problems with menu cost by Barro (1972), Tsiddon (1993), section 5 of Sheshinski and Weiss (1992), Caplin and Leahy (1997), and chapter 12 of Stokey (2008), among others.
Moreover the idiosyncratic shocks considered are stylized, e.g. random walks with constant volatility, as used in Barro (1972), Tsiddon (1993), Gertler and Leahy (2008) and Danziger (1999) among others. Facing the same challenges, we adopt similar assumptions for each of the goods, which allows the complete characterization mentioned above. Regretfully, the simplifications needed preclude the analysis of the case of asymmetric demands for goods subject to the common adjustment cost, or of correlated shocks to each of the goods.
The solution of the firm's problem in Section 3 involves finding the set over which prices are adjusted, and the set where they are not, i.e. the inaction set. Due to the lack of drift, when prices are adjusted they are set equal to the frictionless prices, i.e. the price gaps are set to zero in all dimensions. To our knowledge this is the first fixed cost adjustment problem in n dimensions whose solution is analytically characterized. We believe that this is because of the difficulty of finding a tractable boundary condition and a candidate solution that is smooth enough on the boundary of the inaction region. We show that the optimal decision is to control the price gaps so that they remain in the interior of an n-dimensional ball centered at the origin. The economics of this is clear: the firm will adjust either if many of its price gaps have a medium size, or if a few gaps are very large, since at the margin a larger gap hurts profits more. The size of this ball, whose squared radius we denote by ȳ, is chosen optimally. We solve for the value function and completely characterize the size of the inaction set ȳ as a function of the parameters of the problem. We also obtain a very accurate approximation for a small cost ψ, which shows that ȳ takes the form of a square root function, ȳ ≈ [2(n + 2)σ² ψ/B]^{1/2}.
In Section 4 we explore several steady-state implications of the model. First, we show that the expected number of price adjustments per unit of time, denoted by N_a, is given by nσ²/ȳ, which together with our result for ȳ gives a complete characterization of the frequency of price adjustments. Second, we solve in closed form for the hazard rate of price changes as a function of the time elapsed since the last change. The shape of this function, except for its scale, depends exclusively on the number of products n. The scale of the function is completely determined by the expected number of adjustments per unit of time N_a, which we have already solved for. For a given n, the hazard rate is increasing in duration and has an elongated S shape with a finite asymptote. Comparing across values of n, while keeping the expected number of adjustments constant, we show that the asymptote of the hazard rate is increasing in n. As n increases, an adjustment becomes more likely later on. As n → ∞ the hazard rate function converges to an inverted L shape, i.e. to the one of models with deterministic adjustments as in Taylor's (1980) or Reis's (2006) models.³ Third, we characterize the shape of the distribution of price changes. While price changes occur simultaneously for the n products, we characterize the marginal distribution of price changes, i.e. the statistic that is usually computed in actual data sets. A closed form expression for the density of the marginal distribution of price changes as a function of ȳ and n is given. Using this density we compute several statistics that have been computed in the data, such as the standard deviation of price changes Std(∆p), and other moments which are only functions of n, such as the coefficient of variation and the excess kurtosis of the absolute value of price changes. We show that, as the number of products increases, the size of the adjustments decreases monotonically, i.e.
with more products the typical adjustment is smaller in each product. These cross-section predictions can be used to identify the parameters of the model and test its implications. When compared to the tabulations for US data by Bhattarai and Schoenle (2010) we find matching patterns: higher values of n imply higher dispersion, smaller average price changes, and fatter tails. We show that once the size of price changes is controlled for, the shape of the size distribution is exclusively a function of the number of products n. For n = 2 the distribution is bimodal, with modes at ±√ȳ; for n = 3 it is uniform; for n = 4 it peaks at zero and is concave; and for larger n it is bell-shaped. As n → ∞, the density of price changes converges to a Normal. The sensitivity of the shape of the distribution of price changes to n is useful for discriminating among different models of price adjustment. In particular, bimodality is only predicted for n = 1 or n = 2, as in the models of Golosov and Lucas (2007) and Midrigan (2011) respectively.⁴ Furthermore, this helps to discriminate with respect to other theories of price adjustment, such as Alvarez, Lippi, and Paciello (2011b), which is based on a mixture of information and menu costs.
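The shapes listed above (bimodal for n = 2, uniform for n = 3, concave with a peak at zero for n = 4, bell-shaped for larger n) can be illustrated with a short numerical sketch. It assumes, as the symmetry of the problem suggests, that at the time of adjustment the vector of price gaps is uniformly distributed on the sphere of squared radius ȳ, and it uses the standard formula for the marginal density of one coordinate of such a point. The function names are hypothetical, and the expression is not taken from the text above.

```python
import math


def price_change_density(dp, ybar, n):
    """Marginal density of one coordinate of a point drawn uniformly from the
    sphere of radius sqrt(ybar) in R^n (assumed setting, requires n >= 2)."""
    radius = math.sqrt(ybar)
    if abs(dp) >= radius:
        # Outside the support; the endpoints themselves are treated as outside.
        return 0.0
    # Normalizing constant Gamma(n/2) / (sqrt(pi) * Gamma((n-1)/2) * radius).
    const = math.gamma(n / 2) / (
        math.sqrt(math.pi) * math.gamma((n - 1) / 2) * radius
    )
    return const * (1.0 - dp * dp / ybar) ** ((n - 3) / 2)


def std_price_change(ybar, n):
    """Standard deviation of a single price change: the n squared changes sum
    to ybar at adjustment, so by symmetry each has mean ybar / n."""
    return math.sqrt(ybar / n)


if __name__ == "__main__":
    ybar = 0.04
    for n in (2, 3, 4, 10):
        values = [price_change_density(x, ybar, n) for x in (0.0, 0.1, 0.19)]
        print(n, [round(v, 3) for v in values], round(std_price_change(ybar, n), 4))
```

Under this assumption the exponent (n − 3)/2 is what switches the shape: it is negative for n = 2 (mass piles up at the edges), zero for n = 3 (flat), and positive for n ≥ 4 (mass concentrates around zero).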

Summary of the impulse response to a monetary shock
In Section 5 we use the analytical characterization of the firm's problem to derive the impulse response of the aggregate price level and output to a monetary shock. We study the effect of a shock keeping the decision rules of the firms constant, an approximation used in some of the calculations by Golosov and Lucas (2007) and Caballero and Engel (1991, 1993), among many others. Indeed, we justify this practice in our model by showing that the general equilibrium feedback effects have a third or smaller order effect on the firm's objective function, so when aggregate shocks as well as adjustment costs are small, these effects are negligible for the size of the inaction region, a result closely related to the one in Gertler and Leahy (2008).⁵ In particular, we characterize analytically the effect on aggregate prices of a permanent unexpected increase in the money supply in an economy that starts at the cross-sectional stationary distribution of price gaps under zero inflation. This impulse response has been tackled numerically by Midrigan (2007) for a model with n = 2 (as well as fat-tailed shocks), and compared with the n = 1 case analyzed by Golosov and Lucas (2007).
To understand the effect of a monetary shock note that, in the language of Golosov and Lucas (2007), economies with different values of n have a different amount of "selection".
We show that as n increases there are more firms close to the adjustment threshold, so that more firms adjust after a shock. But as n increases the average size of the price changes for adjusting firms is smaller. The net effect of these two forces on the aggregate price level depends on both the size of the monetary shock and on the number of products n.⁶ We obtain an analytical characterization of the impulse response function (IRF for short) of prices to a monetary shock, which we divide into the impact effect (a jump in the price level) and the remaining part. The impulse response function depends only on two parameters: the steady state frequency of price changes, N_a, and the steady state standard deviation of price changes, Std(∆p), two statistics that are readily available for actual economies. More precisely, the IRF is homogeneous of degree one in the size of the shock, δ, and in Std(∆p).
Instead the duration of the impulse response is inversely proportional to the steady state frequency of price changes, N_a, i.e. time can be measured relative to the steady state average duration of prices. When monetary shocks are larger than twice Std(∆p) the economy features complete price flexibility. Instead, for small monetary shocks the impact effect on prices is second order compared to the shock size, and hence the impact effect on output is of the order of the monetary shock. These results, together with the homogeneity, characterize the precise sense in which the size of the shocks matters. Fixing the two steady state parameters, N_a and Std(∆p), the whole shape of the impulse response depends only on the number of products n and the normalized size of the shock δ/Std(∆p).
For empirically relevant sizes of monetary shocks, i.e. small relative to Std(∆p), economies with more products display a smaller impact effect on prices and a more protracted response, i.e. they are "stickier". As we move from n = 1 to a large number of products (say n ≥ 10) the impact effect on prices as well as the half-life of a monetary shock more than doubles. Indeed as n → ∞ the IRF converges to the one corresponding to the staggered price setting of Taylor's (1980) model, or to Reis's (2006) rational inattention model. In this case there is absolutely no selection and the impulse response is linear in time and has, for small shocks, a half-life of 1/(2N_a), i.e. half the steady state average duration of prices. Finally, we analyze the implied effect on output which, in general, has a longer half-life than the impact on prices. We use the area under the IRF as a summary measure of the output effect. We show that this area is proportional to the ratio of the two steady state statistics, Std(∆p)/N_a, and, for a given n, depends on the size of the (normalized) monetary shock. We show that the largest cumulated effect on output is small, of the order of 1 per cent, and that it is increasing in n, more than doubling as n goes from 1 to ∞, a limit that is almost achieved for n ≈ 10.

The firm's problem: setup and interpretation
Let n be the number of products sold by the firm. The mathematical model we use has an n-dimensional state p that we refer to as the vector of price gaps, whose interpretation is discussed below. Each price gap p_i, while it is not controlled, evolves according to a random walk without drift, so that dp_i = σ dW_i, where W_i is a standard Brownian Motion. The n Brownian Motions (BM henceforth) are independent, so E[W_i(t) W_j(t′)] = 0 for all t, t′ ≥ 0 and all i ≠ j, with i, j = 1, ..., n. The value function V(p) is the minimum value of the function V_0 defined over the processes {τ_j, ∆p_i(τ_j)}, j = 1, 2, ...:
$$V_0\big(\{\tau_j, \Delta p_i(\tau_j)\}_{j=1}^{\infty};\, p\big) \equiv E\left[\sum_{j=1}^{\infty} e^{-r\tau_j}\,\psi + \int_0^{\infty} e^{-rt}\, B \sum_{i=1}^{n} p_i^2(t)\, dt \;\Big|\; p(0) = p\right] \qquad (1)$$
$$p_i(t) = p_i(0) + \sigma W_i(t) + \sum_{j:\,\tau_j < t} \Delta p_i(\tau_j) \quad \text{for all } t \ge 0 \text{ and } i = 1, \dots, n, \qquad (2)$$
so that τ_j are the (stopping) times at which control is exercised. At these times, after paying the cost ψ, the state can be changed to any value in R^n. We denote the vector of changes in the price gaps as ∆p(τ_j) ∈ R^n. This is a standard adjustment cost problem subject to a fixed cost, with the exception that after paying the adjustment cost ψ the decision maker can adjust the state in all n dimensions.
Next we discuss two interpretations of the problem, that can be summarized by saying that the firm "tracks" the prices that maximize instantaneous profits from the n products.
In both interpretations a monopolist sells n goods with additively separable demands; in the first one subject to cost shocks, and in the second subject to demand shocks. For the first interpretation consider a system of n independent demands, with constant elasticity η for each product, random multiplicative shifts in each of the demands, and a time-varying marginal (and average) cost W Z_i(t). This is a stylized version of the problem introduced by Midrigan (2011) where the elasticity of substitution between the products sold within the firm is the same as the one of the bundle of goods sold across firms. The instantaneous profit-maximizing price is proportional to the marginal cost, or in logs p*_i(t) = log W + log Z_i(t) + log(η/(η − 1)). In this case we assume that the log of the marginal cost evolves as a random walk with drift so that p*_i(t) inherits this property. The period cost is a second order expansion of the profit function with respect to the vector of the log of prices, around the prices that maximize current profits (see Appendix B for a detailed presentation of this interpretation). The units of the objective function are lost profits relative to the value of the current maximum profits for the n goods. The first order price-gap terms in the expansion are zero because we are expanding around p*(t). There are no second order cross terms due to the separability of the demands. Thus we can write the problem in terms of the gap between the actual price and the profit-maximizing price: p(t) = p̂(t) − p*(t), where p̂(t) denotes the vector of (log) prices actually charged. Under this approximation the constant B is given by B = (1/2)η(η − 1)/n, where n appears in the denominator since the cost of the deviations for the n price gaps is divided by the total profits generated by the n goods.⁷ The fixed cost relative to the profit of the n products is then ψ/n.
Clearly all that matters to characterize the decision rules is the ratio of B to ψ, thus in equation (1) we omit the terms that are common (such as n and the expression for total profits), which only scale the units of the value function. For the second interpretation of the model, consider a monopolist facing identical demands for each of the n products that she sells. Each demand is linear in its own price, and has zero cross partials with respect to the other prices. The marginal costs of producing each of the products are also identical, and assumed to be linear. The intercepts of each of the n demands follow independent standard BMs. In this interpretation the firm's profits are the sum of the n profit functions derived in the seminal work by Barro (1972), so that our ψ is his γ and our B is his θ, as defined in his equation (12).
Alternatively, the same set-up can be interpreted as a rational inattention problem similar to the one studied by Reis (2006) and Alvarez, Lippi, and Paciello (2011b). The firm has the same demand system for n products, and hence the same total period losses B||p(t)||², which are assumed to be continuously and freely observed. Furthermore, if the firm pays an observation cost ψ, it observes the determinants of the profits of each of the products separately, and is able to set prices based on this information. In this case ψ represents the cost of gathering and processing the information, in addition to (or instead of) the menu cost of changing prices.
To understand the mechanics of the model and to interpret differences in behavior in a cross section of firms, we will consider two technologies for scaling the cost of adjustment ψ as n varies. The first one, which we refer to as constant returns to scale technology (CRS), scales the cost linearly with n, so that ψ = n ψ 1 . The second one, which we refer to as the constant fixed cost technology (CFC), assumes ψ = ψ 1 , so that the cost does not vary with n. We think that these two extreme cases bracket all of the interesting setups.

Characterization of the solution
We first note the following basic properties of the value function and the optimal policy. 1. Given the symmetry of the BM and of the objective function around zero, and the independence of the BM's, one can use reflection around zero to show that the value function only depends on the absolute values of p i , i.e. V (p) = V (|p 1 |, |p 2 |, ..., |p n |) for all p ∈ R n .
2. Given the lack of drift of price gaps and the symmetry of the return function, of the law of motion and of target prices, it is easy to see that after an adjustment the state is reset at the origin, i.e. p(τ_j⁺) = 0, or ∆p(τ_j) = −p(τ_j⁻). See Lemma 1 in Appendix A for a formal argument.
3. The state space R n can be divided in two open sets: an inaction region I ⊂ R n and control region C ⊂ R n . We use ∂I for the boundary of the inaction region. We have that C ∩ I = ∅, that inaction is strictly preferred in I, that control is strictly preferred in C, and that in ∂I the agent is indifferent between control and inaction.
We write down the conditions for the solution of the problem, provided that the value function is smooth enough, i.e. we look for a solution of the "strong" formulation of the problem with V ∈ C¹(R^n) and V ∈ C²(R^n \ ∂I), so the function is once differentiable in the whole domain, and twice differentiable everywhere except on the boundary of the inaction set. In the range of inaction the cost for the firm is given by the following Bellman equation:
$$r\,V(p) = B \sum_{i=1}^{n} p_i^2 + \frac{\sigma^2}{2} \sum_{i=1}^{n} V_{ii}(p) \qquad (3)$$
for all p ∈ I. In the control region we have:
$$V(p) = V(0) + \psi \qquad (4)$$
for all p ∈ C. The optimality of returning to the origin implies that
$$V_i(0, 0, \dots, 0) = 0 \quad \text{for all } i = 1, 2, \dots, n. \qquad (5)$$
Finally, differentiability on the boundary of the inaction region gives
$$V_i(p_1, p_2, \dots, p_n) = 0 \quad \text{for } i = 1, 2, \dots, n \text{ and for all } p \in \partial I. \qquad (6)$$
We refer to this condition as smooth pasting.
We briefly comment on the control theory results that apply to our problem. A recent statement of the general problem, and an existence result for a viscosity solution, is given in Baccarin (2009). Instead we look for a strong solution, i.e. a smooth one.⁸ Øksendal (2000) and Aliev (2007) analyze a general class of slightly simpler stopping time problems in n dimensions. They consider a problem with a one-time decision of when to collect a given reward function of the state, denoted by g. Before that time the decision maker has either zero flow returns or, in the case of Øksendal (2000), she receives a flow return f that depends on the state. The decision maker maximizes the expected discounted value of the reward.⁹ Their problem maps into ours by making the reward g(p) = V(0) + ψ and the flow return f(p) = B||p||². Aliev (2007) shows that equation (6) is necessary for optimality, provided that p ∈ ∂I is a regular point for the stopping set C with respect to the process {σW(t)} and that the derivatives of the value function in a neighborhood of ∂I are bounded. Theorem 10.4.1 in Øksendal (2000) is a verification theorem in terms of variational inequalities which, when adapted to our set-up, says that if there exists a function V that satisfies equations (3)-(6) and some additional conditions, which we state and check in our proof, then this function solves the stopping time problem.
⁸ Theorem 1 in Baccarin (2009) shows the existence of a continuous value function V and a policy described by a continuation and control region for a class of problems that includes ours. His set-up includes a more general form of adjustment cost, period return function, and law of motion for the state, as well as weaker differentiability assumptions on these functions. Strictly speaking, our problem does not fit one of the assumptions for Theorem 1. In particular, Assumption (2.4) requires that the cost diverges to infinity as the norm of the adjustment diverges. Nevertheless, we can artificially modify our problem by incorporating a proportional adjustment cost that applies only when ||p|| is very large, without altering our solution.
⁹ Discounting, not included in this analysis, is easy to introduce by taking time as one of the n states.
Before presenting the solution of this problem we change the state space, which we summarize using a single variable. Let
$$y \equiv \sum_{i=1}^{n} p_i^2 \qquad (7)$$
measure the deviation of prices from their optimal value across the n goods. We consider policies summarized by a single number ȳ. In this class of policies the firm controls the state so that if y < ȳ, there is inaction. The first time that y reaches ȳ, all prices are adjusted to the origin, so that y = 0. We will find the optimal policy in this class. Then we will show that the optimal policy of the original problem is of this form.
The variable y measures the square of the radius of a sphere centered at the origin. Since each price gap follows an independent BM with the same volatility σ (when uncontrolled), y follows a simple diffusion in the inaction region. Using Ito's Lemma on equation (7), the evolution of y is
$$dy = n\sigma^2\, dt + 2\sigma \sum_{i=1}^{n} p_i\, dW_i .$$
This implies that the quadratic variation of y is:
$$E\big[(dy)^2\big] = 4\sigma^2 \sum_{i=1}^{n} p_i^2\, dt = 4\sigma^2 y\, dt .$$
Thus we can define a stochastic differential equation for y with a new standard BM {W(t)} that solves:
$$dy = n\sigma^2\, dt + 2\sigma\sqrt{y}\; dW . \qquad (8)$$
We note that for the unregulated process, i.e. when ȳ = ∞, if y(0) > 0 then y(t) > 0 for t > 0 with probability one provided that n ≥ 2, see Karatzas and Shreve (1991) Proposition 3.3.22.¹⁰ Note that the drift and diffusion terms in equation (8) are only functions of y. We also note that the instantaneous return is a function of y, so we can write the following
$$v(y) = \min_{\bar y}\; E\left[\sum_{j=1}^{\infty} e^{-r\tau_j}\,\psi + \int_0^{\infty} e^{-rt}\, B\, y(t)\, dt \;\Big|\; y(0) = y\right] \qquad (9)$$
subject to equation (8) when y ∈ [0, ȳ], where τ_j are the times at which y(t) hits ȳ. The function v solves:
$$r\,v(y) = B y + n\sigma^2 v'(y) + 2\sigma^2 y\, v''(y), \quad \text{for } y \in (0, \bar y). \qquad (10)$$
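The drift of the squared norm of the price gaps can be checked with a small Monte Carlo experiment. The sketch below assumes n independent driftless Brownian motions with volatility σ, as in the setup above, and estimates the mean of y(t) = ||p(t)||² for the unregulated process, which should be close to y(0) + nσ²t if the drift of y is nσ². The function name, the seed and the number of draws are illustrative choices, not part of the original analysis.

```python
import random


def mean_squared_gap(n, sigma, t, y0=0.0, draws=20000, seed=0):
    """Monte Carlo estimate of E[y(t)] for the uncontrolled process, where
    y = sum of squared price gaps and each gap is a driftless BM."""
    rng = random.Random(seed)
    # Put the whole initial squared gap on the first coordinate; by rotational
    # symmetry only y0 matters for the distribution of y(t).
    start = [y0 ** 0.5] + [0.0] * (n - 1)
    scale = sigma * t ** 0.5  # standard deviation of each gap's increment over [0, t]
    total = 0.0
    for _ in range(draws):
        total += sum((p0 + scale * rng.gauss(0.0, 1.0)) ** 2 for p0 in start)
    return total / draws


if __name__ == "__main__":
    n, sigma, t, y0 = 3, 0.1, 2.0, 0.01
    print("simulated:", mean_squared_gap(n, sigma, t, y0))
    print("y0 + n*sigma^2*t:", y0 + n * sigma ** 2 * t)
```

The two printed numbers should agree up to sampling error, which shrinks with the square root of the number of draws.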
Since the policy calls for adjustment at values higher than ȳ we have:
$$v(y) = v(0) + \psi \quad \text{for all } y \ge \bar y. \qquad (11)$$
If v is differentiable at ȳ we can write the two boundary conditions:
$$v(\bar y) = v(0) + \psi \quad \text{and} \quad v'(\bar y) = 0. \qquad (12)$$
These conditions are typically referred to as value matching and smooth pasting. For y = 0 to be the optimal return point, it must be a global minimum, and thus we require that:
$$v'(0) \ge 0. \qquad (13)$$
Note the weak inequality, since y is non-negative. The next proposition finds an analytical solution for v in the range of inaction.
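Proposition 1. Let σ > 0. The ODE given by equation (10) is solved by the following analytical function:
$$v(y) = \sum_{i=0}^{\infty} \beta_i\, y^i \quad \text{for all } y \in [0, \bar y], \qquad (14)$$
where the coefficients {β_i} solve:
$$\beta_1 = \frac{r}{n\sigma^2}\,\beta_0, \qquad \beta_2 = \frac{r\beta_1 - B}{2\sigma^2 (n+2)}, \qquad \beta_{i+1} = \frac{r}{(i+1)\,\sigma^2\,(n+2i)}\,\beta_i \quad \text{for } i \ge 2,$$
for any β_0.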
The proof follows by replacing the function in equation (14) into the ODE (10) and matching the coefficients for the powers of y i . By the Cauchy-Hadamard theorem, the power series converges absolutely for all y > 0 since lim i→∞ β i+1 /β i = 0. The next proposition shows that there exists a unique solution of the ODE (10) satisfying the relevant boundary conditions (see Appendix A for the proof).
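The coefficient-matching argument can be reproduced numerically. The sketch below assumes the recursion obtained by equating powers of y in equation (10), namely β_1 = rβ_0/(nσ²), β_2 = (rβ_1 − B)/(2σ²(n + 2)) and β_{i+1} = rβ_i/((i + 1)σ²(n + 2i)) for i ≥ 2, builds a truncated power series, and evaluates the residual of the ODE at a point. The function names and parameter values are hypothetical.

```python
def series_coefficients(beta0, B, sigma, r, n, terms=40):
    """First `terms` coefficients of v(y) = sum_i beta_i y^i, obtained by
    matching powers of y in r v = B y + n s2 v' + 2 s2 y v''."""
    s2 = sigma ** 2
    beta = [beta0, r * beta0 / (n * s2)]                 # order-0 terms
    beta.append((r * beta[1] - B) / (2 * s2 * (n + 2)))  # order-1 terms
    for i in range(2, terms - 1):
        beta.append(r * beta[i] / ((i + 1) * s2 * (n + 2 * i)))
    return beta


def ode_residual(beta, y, B, sigma, r, n):
    """Left side minus right side of the ODE, using the truncated series."""
    s2 = sigma ** 2
    v = sum(b * y ** i for i, b in enumerate(beta))
    v1 = sum(i * b * y ** (i - 1) for i, b in enumerate(beta) if i >= 1)
    v2 = sum(i * (i - 1) * b * y ** (i - 2) for i, b in enumerate(beta) if i >= 2)
    return r * v - (B * y + n * s2 * v1 + 2 * s2 * y * v2)


if __name__ == "__main__":
    beta = series_coefficients(beta0=1.0, B=20.0, sigma=0.15, r=0.05, n=3)
    print("residual at y = 0.01:", ode_residual(beta, 0.01, 20.0, 0.15, 0.05, 3))
    print("ratio beta_11/beta_10:", beta[11] / beta[10])
```

The ratio of successive coefficients shrinks like 1/i², which is the property behind the convergence of the series for every y > 0.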
The next proposition uses a slightly modified version of the verification theorem in Øksendal (2000) to show that the value function v and the threshold policy ȳ that we found in Proposition 2 for the one-dimensional representation indeed characterize the inaction set I = {p : ||p||² < ȳ} and the control set C, as well as the value function V for the original n-dimensional problem (see Appendix A for the proof).
Proposition 3. Let v be the solution of the restricted problem in equations (8) and (9). Let V(p) = v(||p||²). This is the solution of the problem described in equations (1) and (2).
For completeness we comment on the case of n = 1 product and on the case of n > 1 products with perfectly correlated target prices. In the case of one product, i.e. n = 1, the solution for V is easily seen to be the sum of a quadratic and of two exponentials:
$$V(p) = \frac{B}{r}\,p^2 + \frac{B\sigma^2}{r^2} + \beta\left(e^{\zeta p} + e^{-\zeta p}\right),$$
where ζ = √(2r)/σ. The constant β and the boundary value ȳ (equal to p̄² for n = 1) are chosen to enforce smooth pasting and value matching. Moreover, it is easy to see that in that case v(y) = V(√y) solves the ODE in (10) and its boundary conditions. We note that the solution for the n = 1 case and the expression for the approximation for ȳ are the same ones derived in Karlin and Taylor (1981), chapter 15, section 3.F, and Dixit (1991), expression (11).
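The claim that v(y) = V(√y) solves the ODE in (10) for n = 1 can be checked numerically. The sketch assumes the quadratic-plus-exponentials form V(p) = (B/r)p² + Bσ²/r² + β(e^{ζp} + e^{−ζp}) with ζ = √(2r)/σ, which is the symmetric solution of rV = Bp² + (σ²/2)V″; the constants in this form are derived here rather than quoted from the text. It uses the chain rule to express v′ and v″ in terms of V′ and V″. The function name and the parameter values are hypothetical, and β is left arbitrary because the ODE holds for any β.

```python
import math


def residual_in_y(y, beta, B, sigma, r):
    """Residual of r v = B y + sigma^2 v' + 2 sigma^2 y v'' (the n = 1 case)
    for v(y) = V(sqrt(y)), with V the assumed quadratic-plus-exponentials form."""
    p = math.sqrt(y)
    zeta = math.sqrt(2 * r) / sigma
    e_plus, e_minus = math.exp(zeta * p), math.exp(-zeta * p)
    V = B * p * p / r + B * sigma ** 2 / r ** 2 + beta * (e_plus + e_minus)
    V1 = 2 * B * p / r + beta * zeta * (e_plus - e_minus)        # V'(p)
    V2 = 2 * B / r + beta * zeta ** 2 * (e_plus + e_minus)       # V''(p)
    v1 = V1 / (2 * p)                                            # v'(y)
    v2 = (V2 * p - V1) / (4 * p ** 3)                            # v''(y)
    return r * V - (B * y + sigma ** 2 * v1 + 2 * sigma ** 2 * y * v2)


if __name__ == "__main__":
    for y in (0.005, 0.01, 0.02):
        print(y, residual_in_y(y, beta=-3.0, B=20.0, sigma=0.15, r=0.05))
```

The residual should be zero up to floating point error for any β, which is why β remains free to be pinned down by value matching and smooth pasting.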
In the case of n perfectly correlated target prices the problem has a single state variable after the first adjustment. In this case, in terms of the threshold policy and value function, the problem is identical to the one with only one price. The static return is thus nBp(t)², where p(t) is, when uncontrolled, a one-dimensional Brownian motion. The only difference with the n = 1 problem is that the constant B is multiplied by n, so that B/ψ is proportional to n. The economics is that since the adjustment has the same effectiveness for all products, it is as if each adjustment was cheaper. Note that in the case of the CRS assumption (where ψ = nψ_1) the value of the adjustment threshold, and hence the frequency of adjustment, is independent of n. Instead, in terms of the implication for price changes, the problem with perfectly correlated shocks features no small price changes. When adjustment takes place, all products have the same price gap. We finish this section by characterizing the optimal policy ȳ in terms of the structural parameters of the model (ψ/B, σ², n, r).
Proposition 4. The optimal threshold is given by a function ȳ = (σ²/r) Q(ψr²/(Bσ²), n) so that
(i) ȳ is strictly increasing in ψ/B,
(ii) ȳ is strictly increasing in n and ȳ → ∞ as n → ∞,
(iii) ȳ is bounded below by √(2(n + 2)σ² ψ/B), and the bound becomes arbitrarily accurate as ψr²/(Bσ²) → 0,
(iv) the elasticity of ȳ with respect to ψ/B rises above 1/2 as ψr²/(Bσ²) becomes large,
(v) the elasticities of ȳ with respect to r and σ² satisfy:
$$\frac{r}{\bar y}\frac{\partial \bar y}{\partial r} = 2\,\frac{\psi/B}{\bar y}\frac{\partial \bar y}{\partial(\psi/B)} - 1, \qquad \frac{\sigma^2}{\bar y}\frac{\partial \bar y}{\partial \sigma^2} = 1 - \frac{\psi/B}{\bar y}\frac{\partial \bar y}{\partial(\psi/B)}.$$
See Appendix A for the proof. For the case of n = 1 this characterization is slightly more general than the one in Barro (1972), Karlin and Taylor (1981), and Dixit (1991), but more importantly it holds for any number of products n ≥ 1.¹¹ That ȳ is only a function of the ratio ψ/B is apparent from the definition of the sequence problem. That, as stated in part (i), ȳ is strictly increasing in the ratio of the fixed cost to the benefit of adjustment ψ/B is quite intuitive. Item (ii) says that the threshold is increasing in the number of products n. This is because as n increases, equation (8) shows that the drift of y = ||p||² increases, thus if ȳ stayed constant there would be more adjustments per unit of time, and hence a higher menu cost would be paid. Additionally, if ȳ remains unchanged, the average cost per unit of time also increases. One can show that the second effect is smaller, and hence an increase in n makes it optimal to increase ȳ. Part (iii) gives an expression for a lower bound for ȳ, which becomes arbitrarily accurate for either a small value of the cost ψ/B, so that the range of inaction is small, or a small value of the interest rate r, so that the problem is equivalent to minimizing the steady state average net cost. We note that in the approximation
$$\bar y \approx \sqrt{2(n+2)\,\sigma^2\,\frac{\psi}{B}}$$
the effect of ψσ²/B is exactly the same as in the case of one product. Indeed the quartic root (implied for the optimal threshold p̄ = √ȳ) is the one obtained by Barro (1972), Karlin and Taylor (1981) and Dixit (1991) in a model with n = 1. Part (iv) shows that the approximation worsens, and that the elasticity of ȳ with respect to ψ/B increases above 1/2, as ψr²/(Bσ²) becomes large.
Note that the approximation in part (iii) implies that the elasticity of $\bar y$ with respect to $\psi/B$ is 1/2 for small values of the $\psi/B$ ratio. Then, using part (v), we obtain that $\bar y$ has elasticity 1/2 with respect to $\sigma^2$ and also that it is independent of $r$. Moreover, for small normalized adjustment cost, i.e. as $\psi/(B\sigma^2) \downarrow 0$, (iii) and (v) imply that $\partial \bar y/\partial r \to 0$, so that interest rates have only second order effects on the range of inaction, a result that we find useful to study the impulse responses to aggregate (i.e. monetary) shocks in Section 5. Finally, despite the result in part (iv), we found that the quadratic approximation to $v(\cdot)$, which amounts to a quartic approximation to $V(\cdot)$, gives very accurate values for $\bar y$ across a very large range of parameters, as documented in Appendix D. The reason is that for any realistic application the values of $r$ and $\psi$ are small relative to $B\sigma^2$, hence the approximation given in part (iii) applies.
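The comparative statics of the small-cost approximation can be checked numerically. The sketch below assumes the approximation takes the form $\bar y \approx \sqrt{2(n+2)\,\sigma^2\,\psi/B}$, which is how we read the lower bound in part (iii); the function names and the parameter values are illustrative and not taken from the paper.

```python
import math

def threshold_approx(psi_over_B, sigma2, n):
    """Assumed small-cost approximation: y_bar = sqrt(2 (n + 2) sigma^2 psi / B)."""
    return math.sqrt(2.0 * (n + 2) * sigma2 * psi_over_B)

def elasticity(f, x, rel_step=1e-6):
    """Numerical elasticity d log f / d log x, central difference in logs."""
    up, down = x * (1.0 + rel_step), x * (1.0 - rel_step)
    return (math.log(f(up)) - math.log(f(down))) / (math.log(up) - math.log(down))

sigma2, psi_over_B = 0.01, 0.002  # illustrative values only
for n in (1, 2, 10):
    y_bar = threshold_approx(psi_over_B, sigma2, n)
    e_cost = elasticity(lambda x: threshold_approx(x, sigma2, n), psi_over_B)
    e_var = elasticity(lambda s: threshold_approx(psi_over_B, s, n), sigma2)
    # columns: n, y_bar, price-gap threshold sqrt(y_bar), elasticities
    print(n, round(y_bar, 5), round(math.sqrt(y_bar), 4), round(e_cost, 3), round(e_var, 3))
```

Under this approximation both elasticities equal 1/2 for every $n$, the interest rate $r$ does not appear at all, and the threshold rises with $n$, in line with the discussion of parts (ii), (iii) and (v).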

Implications for frequency and size of price changes
In this section we explore the implications for the frequency and distribution of price changes.
We denote by $T(y)$ the expected time for $y(t)$ to hit the barrier $\bar y$ starting at $y$. This function satisfies: $0 = 1 + n\sigma^2\, T'(y) + 2y\sigma^2\, T''(y)$ for $y \in (0, \bar y)$ and $T(\bar y) = 0$, where the first condition gives the law of motion inside the range of inaction and the second one imposes the terminal condition on the boundary of the range of inaction. The unique solution of this ODE that satisfies the relevant boundary condition is $T(y) = (\bar y - y)/(n\sigma^2)$. Thus $T(0)$ gives the expected time between successive price adjustments, so that the average number of adjustments per unit of time, denoted by $N_a$, is $1/T(0)$. We summarize this result in:

Proposition 5. Let $N_a$ be the expected number of price changes per unit of time for a multi-product firm with $n$ goods. It is given by
$$N_a = \frac{n\sigma^2}{\bar y} = \frac{n\, r}{Q\!\left(\frac{\psi}{B}\frac{r^2}{\sigma^2},\, n\right)} \cong \sqrt{\frac{B\sigma^2}{2\psi}\,\frac{n^2}{n+2}} . \qquad (18)$$

The second equality in equation (18) uses the function $Q(\cdot)$ derived in Proposition 4, while in the last equality we use the approximation of $\bar y$ for small $\psi r^2/(B\sigma^2)$ (see Appendix D for more documentation on the accuracy of the approximation). It is interesting that this expression extends the well known expression for the case of $n = 1$, simply by adjusting the value of the variance from $\sigma^2$ to $n\sigma^2$. The number of products $n$ affects $N_a$ through two opposing forces. One is that with more products the variance of the deviations of the price gaps increases, and thus a given value of $\bar y$ is hit sooner in expected value. This is the "direct effect". On the other hand, with more products, the optimal value of $\bar y$ is higher. Equation (18) shows that, as often happens in these models, the direct effect dominates, and the frequency of adjustment increases with $n$.
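As a check on the steps above, the following short derivation verifies that a linear function solves the ODE for the expected hitting time and then works out the approximate frequency. It assumes the small-cost approximation $\bar y \approx \sqrt{2(n+2)\sigma^2\psi/B}$ discussed after Proposition 4.

```latex
\begin{align*}
T(y) = \frac{\bar y - y}{n\sigma^2}
  \;&\Longrightarrow\; T'(y) = -\frac{1}{n\sigma^2}, \qquad T''(y) = 0, \\
1 + n\sigma^2\, T'(y) + 2y\sigma^2\, T''(y) &= 1 - 1 + 0 = 0,
  \qquad T(\bar y) = 0, \\
N_a = \frac{1}{T(0)} = \frac{n\sigma^2}{\bar y}
  &\approx \frac{n\sigma^2}{\sqrt{2(n+2)\,\sigma^2\,\psi/B}}
  = \sqrt{\frac{B\sigma^2}{2\psi}\,\frac{n^2}{n+2}} .
\end{align*}
```

Since $n^2/(n+2)$ is increasing in $n$, the approximate frequency rises with the number of products when $\psi$ is held fixed.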
We use this expression to study how the bundling of menu costs, i.e. the fact that a single menu cost relates to several products, affects the frequency of adjustment of individual prices. This is interesting because recent evidence in Bhattarai and Schoenle (2010) shows that the frequency of price adjustment is higher for firms that sell a larger number of goods.12 This pattern is qualitatively consistent with the formula in equation (18), which shows that $N_a$ is increasing in $n$. Notice however that in this comparison we are keeping $\psi$ constant, so that as $n$ increases the menu cost per good is decreasing. One may wonder whether the increased activity by the firms follows from the fact that the menu cost is smaller (per good) or from the bundling of the goods' prices. To separate the effects of the economies of scale in the menu cost from the bundling of the goods, consider the case where the cost $\psi$ grows linearly with the number of goods $n$, i.e. $\psi = \psi_1 n$. This gives
$$N_a \cong \sqrt{\frac{B\sigma^2}{2\psi_1}\,\frac{n}{n+2}} , \qquad (19)$$
which is also increasing in $n$, although at a lower rate. Thus, even under "CRS" for the menu cost, the bundling of the goods' pricing induces more frequent adjustments than in the case where the menu costs are dissociated, i.e. when $n = 1$.

We explain the economics behind this result for the CRS case. Define $N_1$ as the optimal number of adjustments per year for a firm selling only 1 good, i.e. with $n = 1$. Consider a multi-product firm with $n > 1$ that follows a policy of doing $N_1$ adjustments per year. Since we are considering the CRS case, the expected amount spent on adjustment per year and per good is the same for both firms: $N_1\psi_1$. But the one-good firm tailors all adjustments to those instances where the deviations from the optimal price are large. Instead, due to the fact that all prices are adjusted at the same time, the goods sold by the multi-product firm will sometimes be adjusted when the price has small deviations, and other times when they have large deviations. Since the profit function is concave in price, it is profitable for the multi-product firm to increase the number of adjustments to decrease the per-good expected price deviation from its optimal price relative to the firm with only one good. This is exactly what the expression in equation (19) shows for the optimal policy, since this expression is increasing in $n$.

12 See Figures 1 and 2 in their paper. These authors group firms into 4 bins, according to the number of items sold (and recorded by the BLS), from 1 to 3 goods in the first bin to more than 7 goods in the fourth bin. They first measure the frequency of price changes at the good level, then compute the median frequency across the goods produced in the firm. Finally, they average these medians inside each of the 4 bins.

[Table 1]
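The two cost technologies can be compared with a few lines of code. This is a minimal sketch, assuming the approximate frequencies $N_a \approx \sqrt{(B\sigma^2/\psi)\, n^2/(2(n+2))}$ for a constant fixed cost and $N_a \approx \sqrt{(B\sigma^2/\psi_1)\, n/(2(n+2))}$ for $\psi = n\psi_1$. The function names and the normalization to 2.4 adjustments per year at $n = 2$ (the calibration target the text describes for Table 1) are our own choices, and the output is not a reproduction of the paper's table.

```python
import math

def freq_cfc(n, b_sigma2_over_psi):
    """Constant fixed cost: N_a = sqrt((B sigma^2 / psi) * n^2 / (2 (n + 2)))."""
    return math.sqrt(b_sigma2_over_psi * n ** 2 / (2.0 * (n + 2)))

def freq_crs(n, b_sigma2_over_psi1):
    """CRS menu cost psi = n * psi_1: N_a = sqrt((B sigma^2 / psi_1) * n / (2 (n + 2)))."""
    return math.sqrt(b_sigma2_over_psi1 * n / (2.0 * (n + 2)))

def calibrate(freq, n_target, target):
    """Pick the cost parameter so that freq(n_target, .) equals target (N_a scales with sqrt)."""
    return (target / freq(n_target, 1.0)) ** 2

x_cfc = calibrate(freq_cfc, 2, 2.4)  # hypothetical normalization at n = 2
x_crs = calibrate(freq_crs, 2, 2.4)
for n in (1, 2, 4, 6, 10):
    print(n, round(freq_cfc(n, x_cfc), 2), round(freq_crs(n, x_crs), 2))
```

Both columns increase with $n$, but the CRS column flattens out, because $n/(n+2)$ is bounded by one while $n^2/(n+2)$ grows without bound.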
In the case with constant fixed cost the effect is even stronger, since in addition the cost of adjustment per good mechanically decreases. Table 1 uses equation (19) for the case of constant returns to scale (CRS) and equation (18) for the case of constant fixed cost (CFC) to calculate hypothetical values of $N_a$ for different values of $n$. For both cases we have selected the value of $B\sigma^2/\psi_1$ so that $N_a$ is 2.4 adjustments per year, the value estimated by Bhattarai and Schoenle (2010) for the US for firms with $n = 2$. Table 1 also includes a row with US data. Comparing the case of CRS with the one with CFC, the former displays a pattern much closer to the one in the US data.

Next we characterize the hazard rate of price adjustments (see Appendix A for the proof).

Proposition 6. Let $t$ denote the time elapsed since the last price change. Let $J_\nu(\cdot)$ be the Bessel function of the first kind. The hazard rate for price changes is given by
$$h_n(t) = \frac{\sum_{k=1}^{\infty} \xi_{n,k}\, \frac{q_{n,k}^2\sigma^2}{2\bar y}\, \exp\!\left(-\frac{q_{n,k}^2\sigma^2}{2\bar y}\, t\right)}{\sum_{k=1}^{\infty} \xi_{n,k}\, \exp\!\left(-\frac{q_{n,k}^2\sigma^2}{2\bar y}\, t\right)}, \quad\text{where}\quad \xi_{n,k} = \frac{q_{n,k}^{\nu-1}}{J_{\nu+1}(q_{n,k})}, \quad \nu = \frac{n}{2} - 1,$$
and $q_{n,k}$ are the positive zeros of $J_\nu(\cdot)$, which asymptotes to
$$\lim_{t\to\infty} h_n(t) = \frac{q_{n,1}^2\,\sigma^2}{2\bar y} = \frac{q_{n,1}^2}{2n}\,\frac{1}{T(0)} .$$

Proposition 6 compares the asymptote of the hazard rate with the expected time until adjustment, which equals $T(0) = \bar y/(n\sigma^2)$, as derived above. Notice that for a model with a constant hazard these two quantities are the reciprocal of each other, i.e. the expected duration is the reciprocal of the hazard rate. We use the product $T(0)\lim_{t\to\infty} h_n(t)$, which is larger than one, as a measure of how close the model is to having a constant hazard rate, or equivalently as a summary measure of how increasing the hazard function is. Also notice that the expression in Proposition 6 immediately shows that, keeping the expected time until adjustment $T(0)$ fixed, the hazard rate is only a function of $n$. Thus the shape of the hazard function depends only on the number of products $n$. Changes in $\sigma^2$, $B$, $\psi$ only stretch the horizontal axis linearly.

[Figure 1. For each $n$ the value of $\sigma^2/\bar y$ is chosen so that the expected time elapsed between adjustments is one.]
Figure 1 plots the hazard rate function $h$ for different choices of $n$, keeping the expected time between price adjustments fixed at one. As Proposition 6 shows, the function $h$ has an asymptote, which is increasing in the number of products $n$. Moreover, since the asymptote diverges to $\infty$ as $n$ increases without bound, the hazard rate converges to an inverted L shape, like the one for a model where adjustments are done exactly every $T(0) = 1$ periods, as in Taylor's (1980) model. To see this note that, defining $\tilde y \equiv y/\bar y$ and fixing the ratio $T(0) = \bar y/(n\sigma^2)$ so that for any $n$ the expected time elapsed between price changes is $T(0)$, we have:
$$d\tilde y = \frac{1}{T(0)}\, dt + 2\sqrt{\frac{\tilde y}{n\, T(0)}}\; dW . \qquad (20)$$
As $n \to \infty$ the process for the normalized size of the price gap $\tilde y$ described in equation (20) converges to the deterministic one, in which case the hazard rate is zero at all times below $T(0)$ and infinite precisely at $T(0)$. For completeness, Table 2 computes the first zero, denoted by $q_{n,1}$, of the relevant Bessel functions and the (normalized) asymptotic hazard rate for several values of $n$.

Table 2

number of products n                    1     2     3     4     6     8     10    20    50    100
first zero of J_{n/2-1}(.): q_{n,1}     1.6   2.4   3.1   3.8   5.1   6.4   7.6   13    30    56
T(0) lim_{t->inf} h_n(t)                1.2   1.4   1.6   1.8   2.2   2.5   2.9   4.5   8.8   16

Note: $T(0)$ is the expected duration and $T(0)\lim_{t\to\infty} h_n(t) = q_{n,1}^2/(2n)$ is the normalized limit hazard rate.
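The entries of Table 2 can be recomputed from the definition $T(0)\lim_{t\to\infty}h_n(t) = q_{n,1}^2/(2n)$. The sketch below is our own illustration using only the Python standard library: it locates the first positive zero of $J_{n/2-1}$ from its power series by a grid scan followed by bisection. The product counts $n$ in the loop are an assumption, namely the values for which the computed zeros round to the first seven entries of the table. The series loses accuracy for large arguments, so this sketch should not be used for very large $n$.

```python
import math

def bessel_series(nu, x, terms=60):
    """Sum_k (-1)^k (x/2)^(2k) / (k! Gamma(k + nu + 1)).
    J_nu(x) equals this sum times (x/2)^nu, so both have the same positive zeros."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * (x / 2.0) ** (2 * k) / (math.factorial(k) * math.gamma(k + nu + 1))
    return total

def first_zero(n, step=0.01):
    """First positive zero q_{n,1} of J_nu with nu = n/2 - 1 (scan, then bisection)."""
    nu = n / 2.0 - 1.0
    lo = step
    while bessel_series(nu, lo + step) > 0:  # the series is positive near x = 0
        lo += step
    hi = lo + step
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bessel_series(nu, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def normalized_hazard(n):
    """T(0) * lim_{t -> inf} h_n(t) = q_{n,1}^2 / (2 n)."""
    return first_zero(n) ** 2 / (2.0 * n)

for n in (1, 2, 3, 4, 6, 8, 10):  # assumed column values, see the lead-in
    print(n, round(first_zero(n), 2), round(normalized_hazard(n), 2))
```

For $n = 1$ and $n = 3$ the zeros are $\pi/2$ and $\pi$, because $J_{-1/2}$ and $J_{1/2}$ are proportional to $\cos x/\sqrt{x}$ and $\sin x/\sqrt{x}$, which gives a simple sanity check on the routine.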
The shape of estimated hazard rates varies across studies, but many have found flat or decreasing ones, and some have found hump-shaped ones. As can be seen from Figure 1, the hazard rate for the case of $n = 1$ is increasing but rapidly reaches its asymptote. As $n$ is increased, the shape of the hazard rate becomes closer to the inverted L shape of its limit as $n \to \infty$. For instance, when $n = 10$ the level of the hazard rate evaluated at the expected duration is about twice as large as the one for $n = 2$. This is a prediction that can be tested in the cross section using the data set in Bhattarai and Schoenle (2010) or Wulfsberg (2010).
Finally we discuss the distribution of price changes. This distribution is characterized by two parameters: the number of goods $n$, and the optimal boundary of the inaction set $\bar y$. The value of $\bar y$, as discussed above, depends on all the parameters. Since after an adjustment price gaps are reset to zero, price changes coincide with the value of $p(\tau) \in \partial I \subset R^n$, the surface of an $n$-dimensional sphere of radius $\sqrt{\bar y}$. Let $\tau$ be a time where $y$ hits the boundary of the range of inaction: then, given that each of the (uncontrolled) $p_i(t)$ is independently and identically normally distributed, price changes $\Delta p(\tau) = -p(\tau)$ are uniformly distributed on the surface of the $n$-dimensional sphere of radius $\sqrt{\bar y}$.13 The next proposition characterizes the marginal distribution of price changes.

Proposition 7.
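Let $\Delta p \in \partial I \subset R^n$ denote a vector of price changes for the $n$ goods. The distribution of the price change of an individual good, i.e. the marginal distribution of $\Delta p_i \in [-\sqrt{\bar y}, \sqrt{\bar y}]$, has density, for $n \geq 2$:
$$\frac{1}{\mathrm{Beta}\!\left(\frac{n-1}{2}, \frac{1}{2}\right)\sqrt{\bar y}}\left(1 - \frac{(\Delta p_i)^2}{\bar y}\right)^{(n-3)/2},$$
where $\mathrm{Beta}(\cdot, \cdot)$ denotes the Beta function. The standard deviation and kurtosis of the price changes, the expected value of the absolute value of price changes and its coefficient of variation are given by:
$$\mathrm{Std}(\Delta p_i) = \sqrt{\bar y/n}, \qquad \mathrm{Kurtosis}(\Delta p_i) = \frac{3n}{n+2},$$
$$E\left[|\Delta p_i|\right] = \frac{2\sqrt{\bar y}}{(n-1)\,\mathrm{Beta}\!\left(\frac{n-1}{2}, \frac{1}{2}\right)}, \qquad \frac{\mathrm{Std}(|\Delta p_i|)}{E(|\Delta p_i|)} = \sqrt{\frac{1}{n}\left[\frac{(n-1)}{2}\,\mathrm{Beta}\!\left(\frac{n-1}{2}, \frac{1}{2}\right)\right]^2 - 1} .$$
As $n \to \infty$ the distribution of $\Delta p_i/\mathrm{Std}(\Delta p_i)$ converges point-wise to a standard normal.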
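The moments of the marginal distribution can be checked by numerical integration. The sketch below is our own illustration, not part of the proof: it assumes that the marginal density of $u = \Delta p_i/\sqrt{\bar y}$ on $(-1, 1)$ is proportional to $(1-u^2)^{(n-3)/2}$ with normalizing constant $1/\mathrm{Beta}((n-1)/2, 1/2)$, which is the standard marginal of a uniform distribution on a sphere, and it uses the substitution $u = \sin\theta$ so that the integrand stays bounded even for $n = 2$.

```python
import math

def beta_fn(a, b):
    """Beta function via Gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def moment(n, g, grid=20000):
    """E[g(u)] for u = dp_i / sqrt(y_bar), n >= 2, midpoint rule in theta with u = sin(theta).
    The density c (1 - u^2)^((n-3)/2) du becomes c cos(theta)^(n-2) d(theta)."""
    c = 1.0 / beta_fn((n - 1) / 2.0, 0.5)
    h = math.pi / grid
    total = 0.0
    for j in range(grid):
        theta = -math.pi / 2.0 + (j + 0.5) * h
        total += g(math.sin(theta)) * math.cos(theta) ** (n - 2)
    return c * total * h

def summary(n):
    """Variance, kurtosis and mean absolute value of the normalized price change u."""
    var = moment(n, lambda u: u * u)
    kurt = moment(n, lambda u: u ** 4) / var ** 2
    mean_abs = moment(n, abs)
    return var, kurt, mean_abs

for n in (2, 3, 4, 10, 50):
    var, kurt, mean_abs = summary(n)
    print(n, round(var, 4), round(kurt, 4), round(mean_abs / math.sqrt(var), 4))
```

The computed variance of $u$ equals $1/n$, so that $\mathrm{Std}(\Delta p_i) = \sqrt{\bar y/n}$, and the kurtosis equals $3n/(n+2)$. The last column, $E|\Delta p_i|/\mathrm{Std}(\Delta p_i)$, falls with $n$ toward the normal value $\sqrt{2/\pi}$.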
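The proof uses results from the characterization of spherical distributions by Song and Gupta (1997) (see Appendix A for the details). Using the previous proposition and the approximation for $\bar y$ we obtain the following expressions for the standard deviation of price changes: $\mathrm{Std}(\Delta p_i) \cong \left(\frac{2\sigma^2\psi}{B}\,\frac{n+2}{n^2}\right)^{1/4}$ for a constant fixed cost $\psi$, and $\mathrm{Std}(\Delta p_i) \cong \left(\frac{2\sigma^2\psi_1}{B}\,\frac{n+2}{n}\right)^{1/4}$ for the CRS case $\psi = n\psi_1$, where both expressions are decreasing in $n$. The expression for the kurtosis of the price changes shows that this statistic is an increasing function of $n$.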
Some of the statistics for $|\Delta p_i|$ in Proposition 7 can be approximated.14 The expression for the approximate value of $E[|\Delta p_i|]$ is given by $\mathrm{Std}(\Delta p_i)$ times a decreasing function of $n$. The expression for the approximate value of $\mathrm{Std}(|\Delta p_i|)/E(|\Delta p_i|)$ shows that this statistic is an increasing function of $n$. We note that the shape of the density of price changes differs substantially for small values of $n$. For $n = 2$ it is U-shaped, for $n = 3$ it is uniform, for $n = 4$ it has the shape of a half circle, and for $n \geq 6$ it has a bell shape.15 Proposition 7 establishes that when $n \to \infty$ the distribution converges to a normal: this can be seen in Figure 2 by comparing the distribution for $n = 50$ with the p.d.f. of a normal distribution with standard deviation equal to the one obtained when $n = 50$.

Table 3 computes the size of the price adjustments, measured as $E[|\Delta p|]$, as a function of $n$. We do so for the two extreme technologies, the constant returns to scale (CRS) and the constant fixed cost (CFC) case. In each case we fix the value of the parameter $B\sigma^2/\psi_1$ so that this statistic is 0.085, the value estimated by Bhattarai and Schoenle (2010) in US data. We also report the values estimated for the US for other values of $n$. Comparing both assumptions, it seems that the US data is somewhere in the middle, but closer to the case of CRS. The better fit of the CRS case for the size of price changes is consistent with the better fit obtained for the frequency of price changes as reported in Table 1.

Furthermore, from the expressions in Proposition 7 the distribution of price changes $\Delta p$, and of their absolute value $|\Delta p|$, depends only on $n$ and $\bar y$. Thus, any normalized statistic such as a ratio of moments (kurtosis, skewness, etc.) or a ratio of points in the c.d.f. depends exclusively on $n$. Indeed the kurtosis is given in Proposition 7 as $\mathrm{Kurtosis}(\Delta p_i) = 3n/(2+n)$, which is an increasing concave function of $n$, starting at 1 and converging to 3.

13 The distribution of $\Delta p(\tau)$ is uniform on the surface of the sphere. To see this notice that the p.d.f. of a jointly normally distributed vector of $n$ identical and independent normals is given by a constant times the exponential of the square radius of the sphere, divided by half of the common variance.

14 The approximation errors for $E[|\Delta p_i|]$ and $\mathrm{Std}(|\Delta p_i|)/E(|\Delta p_i|)$ are smaller than 0.26% and 0.91%.
Table 4 uses the expressions of the model to compute several moments of interest. These moments have been estimated using two scanner data sets by Midrigan (2011) and also using BLS producer data by Bhattarai and Schoenle (2010). A summary of the selected statistics from these papers is reproduced in Table 5. We briefly comment on the reasons why the statistics chosen in Table 4 and Table 5 are of interest. Note that in the case of $n = 1$ price changes are binomial, either $-\sqrt{\bar y}$ or $+\sqrt{\bar y}$ with the same probability, so their absolute value has a degenerate distribution. As the number of goods increases the dispersion of the absolute value increases. The distribution includes larger price changes, so that its kurtosis also increases with $n$. As there are more goods, some goods will be adjusted even if their price is almost optimal, and hence the fraction of small price changes increases with $n$. We draw two conclusions from the comparison of Table 4 with Table 5. First, for the four moments computed our model falls short of the data. In particular, as shown in Proposition 7, the distribution in the model converges to a normal as $n$ goes to $\infty$. Yet the data displays values for the four moments even larger than the ones corresponding to a normal distribution. Second, our model reproduces the pattern of the four moments in terms of their variation with respect to the number of products $n$.

Note to Table 5: for the Bhattarai and Schoenle (2010) data, the number of products $n$ is the mean of the categories considered based on the information in Table 1, the ratio $\mathrm{Std}(|\Delta p_i|)/E(|\Delta p_i|)$ is from Table 2 (Firm-Based), the fraction of $|\Delta p_i|$ which are small is from Table 14, the kurtosis is from Figure 7. The data from Midrigan (2007) are taken from the distribution of standardized prices in Table 2a.

Impulse response of a monetary shock
In this section we study the response of the aggregate price level to an unexpected permanent monetary shock. Understanding this impulse response is useful to quantify the real effects of monetary policy shocks in the presence of menu costs, identify its determinants, and how this effect varies with the number of products n sold by the firm. We will show how these determinants map into simple observable statistics about the size and frequency of price changes, that are available for many economies, and provide an analytical characterization of the output effect caused by the monetary shock.
The general equilibrium set-up where we embed our price setting problem is an adaptation of the one in Golosov and Lucas (2007) to multi-product firms. The representative household has preferences over consumption, labor and real balances (see Appendix B for details), where $c_{ki}(t)$ is the consumption of product $i$ produced by firm $k$, $\ell(t)$ are labor services, $M(t)$ is the nominal quantity of money, $P(t)$ is the nominal ideal price index of one unit of aggregate consumption, and $r > 0$, $\epsilon \geq 1$, $\alpha > 0$, $\eta > 1$ are parameters.
The elasticity of substitution between any two products, $\eta$, is the same regardless of the firms that produce them. The production function for product $i$ in firm $k$ at time $t$ is linear in labor (the only input in the economy) with productivity $1/Z_{ki}(t)$, so the marginal cost of that product is $W(t)Z_{ki}(t)$, where $W(t)$ is the nominal wage. We assume that the idiosyncratic productivity and demand shocks are perfectly correlated, and that $Z_{ki}(t) = \exp(\sigma \mathcal{W}_{ki}(t))$ where the $\mathcal{W}_{ki}$ are standard Brownian motions, independent across all $i, k$.16 Firm $k$ can adjust one or more of its $n$ nominal prices by paying a fixed menu cost equal to a number of labor service units, which we express as $\psi$ times the steady state profits from producing $n$ goods evaluated at the profit maximizing price. Markets are complete, and all firms are owned by the representative household. We use $R(t)$, $W(t)$ and $P_{ki}(t)$ for the time $t$ nominal interest rate, nominal wage, and nominal price of firm $k$ product $i$ respectively. As before we use $p_{ki}(t)$ for the price gap: the log of the ratio of the nominal price of firm $k$ on product $i$ to the frictionless optimal price, or
$$p_{ki}(t) = \log P_{ki}(t) - \log\left(\frac{\eta}{\eta - 1}\, W(t)\, Z_{ki}(t)\right) .$$

We study an economy that starts at time zero with the invariant distribution of firms' prices that corresponds to the steady state with constant money supply equal to $\bar M$. We assume that at time $t = 0$ there is an unanticipated permanent increase in the level of the money supply by $\delta$ log points, so $\log M(0) = \log \bar M + \delta$, where we use a bar to denote the initial steady state values. The next proposition gives a characterization of the equilibrium for this economy, and the circumstances under which the aggregation of the decision rules studied so far provides an accurate approximation of the effect of a monetary shock $\delta$ in a general equilibrium (see Appendix B for the proof, which is standard). We use $p \equiv \{p_i(t)\}_{t \geq 0,\, i = 1, ..., n}$ to denote the stochastic process for the $n$ price gaps, and $\tau \equiv \{\tau_j\}_{j = 1, ...}$ the stopping times, for a generic firm that starts with price gap $p$ (so we omit the firm subindex $k$). We use $c \equiv \{c(t)/\bar c - 1\}_{t \geq 0}$ for the path of aggregate output deviations from the steady state. Finally, we let $\mathcal{V}_0(p, \tau)$ be the (negative of the) expected profits of the firm in equilibrium, and $V_0(p, \tau)$ the objective function in equation (1) (the choice of signs is to make $\mathcal{V}_0$ comparable to $V_0$ in the cost minimization problem).
Proposition 8. Let $\delta \geq 0$. Assume that $\left|\frac{d \log P(t)}{d\delta}\right| \leq \kappa$ for all $t \geq 0$. An equilibrium satisfies equation (23) for all $t \geq 0$. Then the objective function of the firm in a general equilibrium can be written as in equation (24), where $\iota(\cdot)$ does not depend on $(p, \tau)$, and $\Upsilon > 0$ are the per product maximum (frictionless) nominal profits in steady state (immediately before the monetary shock), a constant that depends on $\eta$, $\alpha$, $\epsilon$, $\bar c$ and $\bar W$.
The first part of Proposition 8 shows that a permanent increase in the (log of) money of size $\delta > 0$ increases permanently the (log of) nominal wages, and hence marginal cost, by $\delta$ at time zero. The effect on output on the other hand is gradual, and at each $t$ it depends on how much the aggregate price level $P(t)$ has risen up to that date. This mechanism is common to other general equilibrium sticky price models, such as Danziger's (1999) or Golosov and Lucas's (2007). The second line of equation (23) then shows that the effect of the shock on $P(t)$ can be approximated by analyzing the response of the price gaps: each coordinate of the vector of price gaps falls by a constant $\delta$, i.e. $p_{ki}(0) = \bar p_{ki} - \delta$, before any adjustment takes place. The second result of the proposition is that for small shocks $\delta$ and small adjustment cost $\psi$, the general equilibrium feedback effect is negligible, so that ignoring the effect of the shock on the firm's decision rules (e.g. on the value of $\bar y$) gives a good approximation of the firm's behavior during the convergence to the steady state. This aspect of our analysis provides a foundation for Caballero and Engel (1991, 1993), who pioneered the analytical study of the impulse response in Ss models while ignoring the general equilibrium feedback effects on the decision rules. To see why this approximation holds note that, as Golosov and Lucas (2007) remark, the general equilibrium feedback on the decision rules of firms is completely captured by the effect of aggregate output $c(t)$ on the period $t$ profits. These authors report that quantitatively the general equilibrium feedback effects on the optimal decision rules of the firms are very small.17

The approximation displayed in equation (24) in Proposition 8 explains this finding: the objective function $V_0$ in the partial equilibrium set-up of equation (1) is proportional to the objective function in the general equilibrium set-up $\mathcal{V}_0$ up to terms of third or higher order, so that the general equilibrium feedback effect on the form of the inaction sets is second order. The logic of this result is similar to the result in the New Keynesian model with Calvo pricing that the forward looking Phillips curve does not include interest rate effects or, much closer to our set-up, to the analysis in Section IV of the Ss foundation of a Phillips curve by Gertler and Leahy (2008). Section B.1 in Appendix B contains the proof of our result.
Next, we use the results of Proposition 8 to study the effect of an aggregate monetary shock of size $\delta$ on the aggregate price level $P(t)$ at $t \geq 0$ periods after the shock, which we denote by $\mathcal{P}_n(\delta, t)$. As is common in the sticky price literature, we characterize the first order approximation to the price index, so in particular we study
$$\mathcal{P}_n(\delta, t) \equiv \delta + \int_0^1 \left(\sum_{i=1}^n p_{ki}(t)\right) dk \approx \log P(t)/\bar P .$$
Once we have characterized the effect on the price level, we describe the effect on employment and output.
This impulse response is made of two parts: an instantaneous impact adjustment (a jump) of the aggregate price level which occurs at the time of the shock, denoted by $\Theta_n(\delta)$, and a continuous flow of adjustments from $t > 0$ on, denoted by $\theta_n(\delta, t)$. The cumulative effect on the price level $t \geq 0$ periods after the shock is
$$\mathcal{P}_n(\delta, t) = \Theta_n(\delta) + \int_0^t \theta_n(\delta, s)\, ds .$$
We also study the impact effect on the fraction of firms that change prices, denoted by $\Phi_n(\delta)$. We focus on the cumulative price response because its difference with the monetary shock, i.e. $\delta - \mathcal{P}_n(\delta, t)$, is proportional to the effect on aggregate output at time $t$, as discussed in Section 5.5. Next we present our main results on $\mathcal{P}_n(\delta, t)$ and $\Phi_n(\delta)$ following an aggregate shock.
17 Figure 7 in Golosov and Lucas (2007) compares an impulse response that includes the general equilibrium feedback effect with one computed ignoring this effect, i.e. keeping the firms' decision rules constant. The authors conclude that "Evidently, the approximation works very well for the effects of a one-time shock, even a large one." In Alvarez, Lippi, and Paciello (2011a) we also solve numerically a model with the general equilibrium structure of Golosov and Lucas but where the idiosyncratic shocks and adjustment cost correspond to the quadratic case studied in this paper for $n = 1$, and also find that the general equilibrium feedback effects are very small.

Proposition 9. Fix $n$, the number of goods produced by each firm.
1. Parameters. The impulse response $\mathcal{P}_n(\delta, t)$ depends only on two parameters: $\sqrt{\bar y}$ and $\sigma$, which we re-parameterize as functions of two steady state statistics: the standard deviation of price changes $\mathrm{Std}[\Delta p]$ and the frequency of price changes $N_a$.
2. Scaling and Stretching. The IRF of an economy with steady state $\mathrm{Std}[\Delta p]$, $N_a$ and a shock $\delta$ at horizon $t \geq 0$ is a scaled version of the one of an economy with unit steady state parameters, normalized monetary shock $\delta/\mathrm{Std}[\Delta p]$, and a stretched horizon $N_a t$:
$$\mathcal{P}_n(\delta, t;\, \mathrm{Std}[\Delta p], N_a) = \mathrm{Std}[\Delta p]\;\, \mathcal{P}_n\!\left(\frac{\delta}{\mathrm{Std}[\Delta p]},\, N_a t;\, 1, 1\right) .$$
3. Impact Effects. The impact effects $\mathcal{P}_n(\delta, 0) = \Theta_n(\delta)$ and $\Phi_n(\delta)$ are strictly increasing in $\delta$; they are respectively strictly below $\delta$ and 1 in the interval $(0, 2\,\mathrm{Std}[\Delta p])$ and achieve these values outside this interval. Moreover, impact effects are second order in the monetary shock: $\Theta_n'(0) = \Phi_n'(0) = 0$.
Part 1 of Proposition 9 provides an interesting re-parameterization of the impulse response for three reasons: (i) the steady state statistics $\mathrm{Std}[\Delta p]$ and $N_a$ are readily available for actual economies, (ii) the results of Section 3 and Section 4 imply that, even fixing $n$, one can always choose the two parameter values $\psi/B$ and $\sigma^2$ to match these two statistics, and (iii) keeping fixed these two observable statistics and just changing $n$ we can isolate completely the role of the number of products $n$.
Part 2 of Proposition 9 states a useful "scaling" property of the impulse response function. First notice that at $t = 0$ the impact effect of a monetary shock, $\Theta_n$, is the same for any two economies with the same steady state average size of price changes $\mathrm{Std}[\Delta p]$, and is independent of the value of the steady state frequency of price adjustment $N_a$. Moreover, for all times following the impact ($t > 0$), the normalized effect of a monetary shock $\delta$ in an economy characterized by steady state statistics $\mathrm{Std}[\Delta p]$ and $N_a$ depends only on $n$. This means that, for a fixed $n$, the whole profiles of the impulse response functions in economies with different values of $\mathrm{Std}[\Delta p]$ and $N_a$ are simply scaled versions of each other. For instance, fixing $n$, $\delta$ and $\mathrm{Std}[\Delta p]$, the impulse response functions in two economies that differ in the frequency of price adjustments, say $N_a$ vs $2N_a$, will have exactly the same values of $\mathcal{P}_n$ but will reach these values at different times, respectively $2t$ vs $t$, i.e. an economy with prices that are twice as flexible in steady state has an impulse response that reaches each value in half of the time. Furthermore, keeping $N_a$ fixed, the height of the whole impulse response function $\mathcal{P}_n$ is proportional to the scaled value of the monetary shock. We find this characterization interesting in itself, i.e. interesting even for the $n = 1$ case, but more importantly it will allow us to compare the impulse responses for economies that feature different values of $n$.
Part 3 of Proposition 9 shows that the size of the monetary shock matters, so for large shocks there is instantaneous full price flexibility ($\Theta_n = \delta$), but for small shocks the size of the initial jump in prices is second order compared to the shock. This, together with part 2, implies that whether a monetary shock is large or not is completely characterized by comparing it with the typical price change in steady state, i.e. it is a function of $\delta/\mathrm{Std}[\Delta p]$.

[Figure 3. Normalized impact response of the aggregate price level to a permanent shock in the level of money of size $\delta/\mathrm{Std}[\Delta p]$. The normalization in the left panel consists of dividing the impact response of the price level by $\mathrm{Std}[\Delta p]$, the steady state standard deviation of price changes. Axis label: Number of Products, n. See the text for more details.]
For the reader who is not interested in the derivation of the impulse responses, and an explanation of the different effects behind it, we include two figures that summarize the quantitative conclusions of our analysis. Before getting to these figures, we note that their computation for large values of $n$ would have been extremely costly without the characterization given in the sections below. Recall that if $\Theta_n(\delta) = \delta$ the shock is neutral, and that instead when $\Theta_n(\delta) < \delta$ the shock implies an increase in real output. As stated in Proposition 9, if $\delta \geq 2\sqrt{\bar y/n} = 2\,\mathrm{Std}[\Delta p]$, then all firms adjust prices, and hence the shock is neutral. This explains the range of the normalized shock, between 0 and 2. For the quantification of this figure it is helpful to notice that on the one hand a typical estimate of the standard deviation of price changes for the US or European countries is 10% or higher, i.e. $\mathrm{Std}[\Delta p] \approx 0.1$. On the other hand, to quantify $\delta$ note that in a short interval, say a month, changes of the money supply or prices of the order of 1% are very rare.18 This figure also shows that for small $\delta$ the aggregate price effects are of order $\delta^2$, as stated in Proposition 9. Interestingly, the ranking of the impact response of a monetary shock with respect to $n$ reverses as the value of $\delta$ increases, as can be seen by comparing shocks smaller and larger than $\delta/\mathrm{Std}[\Delta p] \approx 0.7$. Note that using $\mathrm{Std}[\Delta p] = 0.1$ this means that the shocks for which the order is reversed are higher than 7%, a very large value. The right panel of Figure 3 displays four lines, each corresponding to a different value of $\delta$. Each line shows the aggregate effect on prices as $n$ changes, relative to the $n = 1$ case. From these two panels it can be seen that, for monetary shocks of the order of those experienced by economies with inflation close to zero, i.e. for increases in money $\delta/\mathrm{Std}[\Delta p]$ smaller than 0.5 (or, for the benchmark value, for $\delta$ smaller than 5%), economies with more products are more sticky than those with fewer.
Figure 4 plots the impulse response function $\mathcal{P}_n(\delta, t)$ for economies with different $n$, keeping fixed the steady state standard deviation of price changes at 10%, i.e. $\mathrm{Std}[\Delta p] = 0.1$, and an average of one price change per year, i.e. $N_a = 1$. The size of the monetary shock is 1%, so that $\delta/\mathrm{Std}[\Delta p] = 0.1$. In this figure we have time aggregated the effect on the aggregate price level up to daily periods. As required, all impulse responses display an impact effect in the first period, and a monotone convergence to the full adjustment of the shock. The impact effect of the monetary shock during the first periods is to increase prices by about 5% of the long run value (i.e. 5 basis points) for $n = 1$. This effect is smaller in economies where firms produce more products, i.e. the impact at $t = 0$ is decreasing in $n$. This difference is small between one and two products, but the effect is almost halved for firms with 10 products, as shown in Figure 3 for a monetary shock of the same size.
Likewise, the shape and duration of the response depend on $n$. The half-life of the shock more than doubles as the number of products goes from one to fifteen. The shape of the impulse response for $n = 1$ is quite concave, but for large $n$ it becomes almost linear, up to a value of $t$ of about $1/N_a$. This pattern is consistent with the result of Proposition 6, which shows that for large $n$ the model becomes a version of either Taylor's (1980) staggered price model or Reis's (2006) rational inattention model, where the staggering lasts for $T(0) = 1/N_a$ periods. Indeed, in Proposition 13 below we show that as $n \to \infty$ the impulse response becomes linear up to time $1/N_a$ because there is no "selection effect".
Summarizing, we find that extending the analysis from $n = 1$ to a larger number of products (say $n \approx 10$) almost halves the impact effect on the aggregate price level and doubles the half-life of the shock, for empirically reasonable monetary shocks.19 The rest of this section is organized as follows. First, we obtain a closed form solution for the IRF in the case of $n = 1$. We think this is interesting in its own right, but also helpful to better understand the derivation for the $n \geq 2$ case as well as to compare the results. After that, we develop the analytical expressions for the $n \geq 2$ case, concentrating first on the impact effects and then on the remaining part of the impulse response. The proof of Proposition 9, as well as explicit expressions for the impulse responses, are presented as separate propositions in the next subsections. We conclude the section by discussing the real effect of the monetary shocks.

Impulse response for the n = 1 case
In the $n = 1$ case, which, abusing the analogy a bit, we refer to as the Golosov and Lucas (GL) case, the firm controls the price gap between two symmetric thresholds, $\pm\bar p$, and when the price gap hits either of them it returns it to zero. Hence, in the GL case the invariant distribution of price gaps is triangular: the density function has a maximum at the price gap $p = 0$ and decreases linearly on both sides to reach a value of zero at the thresholds $\bar p$ and $-\bar p$, since firms that reach the thresholds will adjust upon a further shock. An example of such a distribution is depicted by the solid line in the left panel of Figure 5. A straightforward computation gives that the slope of this density is $\pm(1/\bar p)^2$. Consider an aggregate shock that displaces the distribution by reducing all price gaps by $\delta$. If the value of $\delta > 2\bar p$ then all the firms will adjust their price, so that $\Phi = 1$, and after a simple calculation one can see that the aggregate price level is increased by $\delta$. Instead, if the value of $\delta$ is smaller than $2\bar p$, only the firms with a sufficiently small price gap will adjust. Denoting the price gap right after the shock by $p_0$, these are the firms that end up with $p_0 < -\bar p$. The density of the distribution of the price gaps immediately after the shock, denoted by $\lambda$, is depicted by the dotted line of Figure 5 and it is given by:
$$\lambda(p_0; \delta) = \begin{cases} (\bar p + \delta + p_0)/\bar p^2 & \text{if } p_0 \in [-\bar p - \delta,\, -\delta], \\ (\bar p - \delta - p_0)/\bar p^2 & \text{if } p_0 \in (-\delta,\, \bar p - \delta]. \end{cases} \qquad (26)$$
For a shock of size $\delta \leq \bar p$ the mass of firms that adjust is $\Phi = (1/2)(\delta/\bar p)^2$, which uses the slope of the density given above (to simplify notation we suppress the $n = 1$ subindex). Note that the magnitude of this fraction is proportional to the square of the shock, a feature that is due to the fact that there are only a few firms close to the boundary of the inaction set. This case is depicted by the dotted line in the left panel of Figure 5.

19 The results are very similar for shocks of 1/2 and 2 percent, as reported in Appendix F.3.
Firms that change prices "close the price gap" completely, so that the price increase will be $\delta + \bar p$ for the firm that prior to the shock had price gap $-\bar p$, and it will be equal to $\bar p$ for the firm with pre-shock price gap equal to $-\bar p + \delta$. Using the triangular distribution of price gaps we have that the average price increase among those that adjust prices equals $\bar p + \delta/3$. Let us denote by Θ the impact effect on aggregate prices of a monetary shock of size δ, the product of the number of firms that adjust times the average adjustment among them:

$$\Theta(\delta) = \Phi(\delta)\left(\bar p + \frac{\delta}{3}\right) = \frac{1}{2}\left(\frac{\delta}{\bar p}\right)^2\left(\bar p + \frac{\delta}{3}\right).$$

Note that in steady state the average size of price changes, as measured by the standard deviation of price changes, is $Std(\Delta p) = \bar p$, so that

$$\frac{\Theta(\delta)}{Std(\Delta p)} = \frac{1}{2}\left(\frac{\delta}{Std(\Delta p)}\right)^2\left(1 + \frac{1}{3}\,\frac{\delta}{Std(\Delta p)}\right).$$

Hence for an economy with one good, the impact effect on prices, normalized by the steady state average price change, depends on the normalized monetary shock, and it is locally quadratic, at least for a small shock. Note that the degree of aggregate stickiness is independent of the steady state fraction of price changes.

We now develop an expression for the impulse response at horizons t > 0. The density of the price gaps $p_0$ right after the monetary shock δ is the displaced triangular distribution λ displayed in Figure 5 and described in equation (26); hence it has $\bar p$ as a parameter. It peaks at −δ and has support $[-\bar p - \delta,\, \bar p - \delta]$. Note that the impact adjustment is concentrated on the firms whose price gap is smaller than $-\bar p$. Now consider the contribution to the change in aggregate prices of the firms whose price gap is $p_0 \in [-\bar p,\, \bar p - \delta]$, so that they have not adjusted on impact, and of which there are $\lambda(p_0;\delta)\,dp_0$. Let $G^-(t; p_0)$ be the probability that a firm with price gap $p_0$ at time zero will increase its price before time t, i.e. the probability that its price gap will hit $-\bar p$ before time t without first hitting $\bar p$. Likewise define $G^+(t; p_0)$ as the corresponding probability of a price decrease, let $G(t; p_0) = G^-(t; p_0) - G^+(t; p_0)$ be the difference between these probabilities, and let g be its density.
We note that these functions have $(\bar p, \sigma^2)$ as parameters. We can now define the contribution to the change in the price level of the adjustments that take place between t and t + dt as:

$$\theta(\delta, t) = \bar p \int_{-\bar p}^{\bar p - \delta} g(t; p_0)\, \lambda(p_0;\delta)\, dp_0 .$$

The integral excludes the initial price gaps $p_0$ that are below $-\bar p$. These correspond to firms that adjusted on impact. Note that θ(δ, t) has $(\bar p, \sigma^2)$ as parameters. Expressions for the densities $g^+$ and $g^-$ can be found in equations (15)-(16) of Kolkiewicz (2002). This gives

$$P(\delta, t) = \Theta(\delta) + \int_0^t \theta(\delta, s)\, ds . \tag{28}$$

Four remarks are in order. First, by substituting our expressions for g and λ we have a closed form solution for each expression in equation (28). Second, note that we did not need to compute the evolution of the whole cross section distribution. Instead, we just follow each firm until the first time that it adjusts its price. This is because the subsequent adjustments have a zero net contribution to aggregate prices, since after the adjustment every firm's price gap returns to zero, and the subsequent adjustments are as likely to be increases as decreases. Third, note that the role of the monetary shock is just to displace the initial distribution, i.e. δ is not an argument of g. Fourth, note that this function has two interesting forms of homogeneity. The first is that it is homogeneous of degree one in σ, $\bar p$ and δ. This follows because scaling $\bar p$ and δ just scales proportionally the distribution λ of the initial price gaps. Furthermore, scaling $\bar p$ and σ leaves unchanged the probability of hitting any two scaled-up values in the same elapsed time. The second type of homogeneity uses that a standard Brownian motion at time t started at time zero has a normal distribution with variance t, so that when the variance of the shock is scaled the price gaps will hit any given value in a correspondingly scaled time. These two homogeneity properties can be verified by integrating the previous expression, which gives an IRF that satisfies the properties stated in Proposition 9.
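The impact quantities for the n = 1 case can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the closed forms are the ones described in the text (a fraction of adjusters of one half of the squared normalized shock, and an average adjustment equal to the threshold plus one third of the shock), and the check is restricted to shocks no larger than the threshold, an assumption made here so that only the rising branch of the displaced triangular density is involved.

```python
import math


def impact_n1(delta, p_bar):
    """Closed-form impact fraction Phi and price-level jump Theta for n = 1.

    Restricted (by assumption in this sketch) to 0 <= delta <= p_bar.
    """
    if not 0.0 <= delta <= p_bar:
        raise ValueError("this sketch covers 0 <= delta <= p_bar only")
    phi = 0.5 * (delta / p_bar) ** 2      # mass of firms pushed below -p_bar
    avg_change = p_bar + delta / 3.0      # mean price increase among adjusters
    return phi, phi * avg_change


def displaced_density(p0, delta, p_bar):
    """Triangular density of price gaps shifted down by delta (peak at -delta)."""
    return max(0.0, p_bar - abs(p0 + delta)) / p_bar ** 2


def impact_n1_numeric(delta, p_bar, m=20000):
    """Midpoint-rule check: integrate the displaced density below -p_bar."""
    lo, hi = -p_bar - delta, -p_bar
    h = (hi - lo) / m
    phi = theta = 0.0
    for k in range(m):
        p0 = lo + (k + 0.5) * h
        w = displaced_density(p0, delta, p_bar) * h
        phi += w
        theta += w * (-p0)                # an adjusting firm closes its gap
    return phi, theta


if __name__ == "__main__":
    # Hypothetical calibration: threshold of 10 percent, shock of 1 percent.
    print(impact_n1(0.01, 0.10))
    print(impact_n1_numeric(0.01, 0.10))
```

The numerical integral and the closed form should agree up to quadrature error, which illustrates that the impact effect is quadratic in the shock for small shocks.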

Invariant distribution of y = ||p||²
Here we study the invariant distribution of the sum of the squares of the price gaps $y \equiv \|p\|^2 = \sum_{i=1}^n p_i^2(t)$ under the optimal policy. This distribution describes the starting point of the economy before the monetary shock, and it is what we need to study the response of firms that are in the steady state to an unexpected shock to their target that displaces the price gaps uniformly. We denote the density of the invariant distribution by f(y) for y ∈ [0, ȳ]. The density is found by solving the corresponding forward Kolmogorov equation with the relevant boundary conditions (see Appendix A for the proof).
The density has a peak at y = 0, decreases in y, and reaches zero at ȳ. The shape depends on n: the density is convex in y for n = 1, 2, 3, linear for n = 4, and concave for n ≥ 5. This is intuitive, since the drift of the process for y increases linearly with n, hence the mass accumulates closer to the upper bound ȳ as n increases. Indeed, as n → ∞ the distribution converges to a uniform distribution on [0, ȳ]. Proposition 10 also makes clear that the shape of the invariant density depends exclusively on n: the other parameters, ψ, B and σ², enter only through ȳ, which only stretches the horizontal axis proportionally.
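The shape properties just described can be verified with a few lines of code. The sketch below assumes the closed form that follows from the boundary conditions used in the proof of Proposition 10 in Appendix A (the density vanishes at ȳ and integrates to one); the function names and the second-difference check are illustrative choices made here, not part of the original analysis.

```python
import math


def invariant_density(y, y_bar, n):
    """Density f(y) of y = ||p||^2 on (0, y_bar] (assumed closed form)."""
    if not 0.0 < y <= y_bar:
        return 0.0
    if n == 2:
        # Logarithmic case of the ODE solution.
        return math.log(y_bar / y) / y_bar
    x = y / y_bar
    return n / ((n - 2.0) * y_bar) * (1.0 - x ** (n / 2.0 - 1.0))


def second_difference(n, y_bar=1.0, at=0.5, h=0.1):
    """Sign gives the local shape: > 0 convex, = 0 linear, < 0 concave."""
    def f(y):
        return invariant_density(y, y_bar, n)
    return f(at - h) - 2.0 * f(at) + f(at + h)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 10):
        print(n, round(second_difference(n), 6))
```

Under these assumptions the second difference is positive for n = 1, 2, 3, zero for n = 4 and negative for n ≥ 5, in line with the convex, linear and concave shapes stated above.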

Impact response in the n ≥ 2 case
Now we turn to studying the economy-wide impact effect of the aggregate shock. To find the fraction of firms that adjust prices we need to characterize some features of the invariant distribution of $p \in \mathbb{R}^n$. We assume that the aggregate shock happens once and for all, so that the price gap process remains the same and the firms solve the problem stated above. First we find out which firms choose to change prices and, averaging among their n products, by how much. A firm with price gap $p \in \mathbb{R}^n$ and state $\|p\|^2 = y \le \bar y$ before the shock will have its price gaps displaced down by δ in each of its n goods, i.e. its state immediately after the shock is $\|p - 1_n\delta\|^2$, where $1_n$ is a vector of ones. This firm will change its prices if and only if the state falls outside the range of inaction, i.e. $\|p - 1_n\delta\|^2 \ge \bar y$, or equivalently if and only if:

$$\frac{\sum_{i=1}^n p_i}{\sqrt{y}} \;\le\; \nu(y,\delta) \equiv \frac{y - \bar y + n\,\delta^2}{2\,\delta\,\sqrt{y}} . \tag{31}$$

Thus ν(y, δ) gives the highest value of the normalized sum of the n price gaps for which a firm with state y will adjust its prices. The normalized sum of price gaps $\sum_{i=1}^n p_i/\sqrt{y}$ takes values on $[-\sqrt n, \sqrt n]$. The right panel of Figure 5 shows the n = 2 case by plotting a circle centered at zero that contains all the pre-shock price gaps, and showing the "displaced" price gaps right after the δ shock, which are given by a circle centered at (−δ, −δ). The (red) shaded area contains all the price gaps of the firms that, after the shock, will find it optimal to adjust their prices, i.e. firms for which equation (31) holds.
A firm whose price gap p satisfies equation (31), i.e. one with $(1/\sqrt{y})\sum_{i=1}^n p_i \le \nu(y,\delta)$, will change all its prices. The mean price change, averaging across its n products, is $\delta - (1/n)\sum_{i=1}^n p_i$. 20 Thus we can determine the fraction of firms that change their prices, and the amount by which they change them, by analyzing the invariant distribution of the squared price gaps, f(y). Let S(z) denote the cumulative distribution function of the sum of the coordinates of the vectors distributed uniformly in the n dimensional unit sphere. Formally we define S : R → [0, 1] as

$$S(z) = \frac{1}{L(S^n)} \int_{x \in \mathbb{R}^n,\ \|x\| = 1} I\{x_1 + x_2 + \dots + x_n \le z\}\; L(dx),$$

where $S^n$ is the n-dimensional sphere, $I\{\cdot\}$ is the indicator function, and L denotes its (n − 1)-dimensional Lebesgue measure.
Note that S(·) is weakly increasing, that $S(-\sqrt n) = 0$, S(0) = 1/2, $S(\sqrt n) = 1$, and that it is strictly increasing for $z \in (-\sqrt n, \sqrt n)$. Remarkably, as shown in Proposition 11, the distribution of the sum of the coordinates of a uniform random variable in the unit n-dimensional sphere is the same, up to scale, as the marginal distribution of any one of its coordinates (which we discussed in Proposition 7), i.e. the c.d.f. satisfies:

$$S'(z) = s(z) \equiv \frac{1}{\sqrt n\; Beta\!\left(\frac{n-1}{2}, \frac{1}{2}\right)} \left(1 - \frac{z^2}{n}\right)^{(n-3)/2} \quad \text{for } z \in (-\sqrt n, \sqrt n), \tag{32}$$

for n ≥ 2, and for n = 1 the c.d.f. S has two points with mass 1/2 at −1 and at +1.

Now we are ready to give expressions for the effect of an aggregate shock δ. First consider $\Phi_n$, the fraction of firms that adjust prices. There are f(y)dy firms with state y in the invariant distribution; among them the fraction S(ν(y, δ)) adjusts. Integrating across all the values of y we obtain the desired expression. Second, consider $\Theta_n$, the change in the price level across all firms. There are f(y)dy firms with state y in the invariant distribution; among them we consider all the firms with normalized sum of price gaps z less than ν(y, δ), of which the fraction s(z)dz adjust prices by $\delta - \sqrt{y}\, z/n$. Considering all the values of y we obtain the relevant expression. This gives:

Proposition 11. Consider an aggregate shock of size δ. The fraction of price changes on impact, $\Phi_n$, and the average price change across the n goods among all the firms in the economy, $\Theta_n$, are given by:

$$\Phi_n(\delta) = \int_0^{\bar y} f(y)\, S\big(\nu(y,\delta)\big)\, dy , \tag{33}$$

$$\Theta_n(\delta) = \int_0^{\bar y} f(y) \left[ \int_{-\sqrt n}^{\nu(y,\delta)} \left(\delta - \frac{\sqrt{y}\, z}{n}\right) s(z)\, dz \right] dy , \tag{34}$$

where s(·) is given by equation (32), which depends on n and is zero outside $(-\sqrt n, \sqrt n)$, and where f(·) and ν(·), which are also functions of ȳ and n, are given in equation (30) and equation (31) respectively.
See Appendix A for the proof. Appendix F gives a closed form solution and the numerical evaluation of equation (33) and equation (34), and a lemma with the analytical characterization of Θ n and Φ n stated in part 3 of Proposition 9.
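The impact quantities can also be approximated by simulation, which gives a quick cross-check on any numerical evaluation. The sketch below is a minimal Monte Carlo under stated assumptions: the invariant distribution of y/ȳ is sampled by bisection on the c.d.f. implied by the boundary conditions in the proof of Proposition 10, ȳ is recovered from Std(Δp) = sqrt(ȳ/n), and all function names are hypothetical.

```python
import math
import random


def cdf_ytilde(x, n):
    """C.d.f. of y / y_bar under the invariant density (assumed closed form)."""
    if x <= 0.0:
        return 0.0
    if n == 2:
        return x * (1.0 - math.log(x))
    return (n * x - 2.0 * x ** (n / 2.0)) / (n - 2.0)


def draw_ytilde(n, rng):
    """Inverse-c.d.f. draw of y / y_bar by bisection on [0, 1]."""
    u, lo, hi = rng.random(), 0.0, 1.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if cdf_ytilde(mid, n) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def impact_mc(n, delta, std_dp, n_firms=20000, seed=0):
    """Monte Carlo estimates of the impact fraction Phi_n and jump Theta_n."""
    rng = random.Random(seed)
    y_bar = n * std_dp ** 2                    # from Std(dp) = sqrt(y_bar / n)
    adjusters, total_change = 0, 0.0
    for _ in range(n_firms):
        y = y_bar * draw_ytilde(n, rng)
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        scale = math.sqrt(y / sum(v * v for v in z))
        post = [scale * v - delta for v in z]  # gaps right after the shock
        if sum(v * v for v in post) >= y_bar:  # outside the inaction region
            adjusters += 1
            total_change += -sum(post) / n     # average of the n price changes
    return adjusters / n_firms, total_change / n_firms


if __name__ == "__main__":
    for n in (1, 2, 10):
        print(n, impact_mc(n, delta=0.01, std_dp=0.10))
```

For n = 1 the estimates can be compared with the closed forms of the one-good case, which is a useful sanity check before using the routine for larger n.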

Impulse response at horizons t > 0 in the n ≥ 2 case
We develop an expression for the impulse responses at horizon t > 0 for the general case of n ≥ 1, in particular we derive an expression for the flow impact on the price level at horizon t which we denote as θ n (δ, t). As in the case of one good, we start by describing the distribution of firms indexed by their price gaps, immediately after the monetary shock δ but before any adjustment takes place. The cdf Λ n (p 0 ) gives the fraction of price gaps smaller or equal to p 0 at time zero right after the impact adjustment caused by the monetary shock δ.
Note that $\Lambda_n(p_0) \le \Theta_n(\delta)$ for all $p_0$. To understand this expression, let $\tilde p_0 \in \mathbb{R}^n$ be the price gap before the monetary shock, which has $y = \|\tilde p_0\|^2$ distributed according to the density f(y) described in equation (30). The price gaps with a given value of $\|\tilde p_0\|^2 \equiv y \le \bar y$ have a uniform distribution on the sphere, so their density depends only on $\|\tilde p_0\|^2$ and integrates to the area of the sphere with square radius y. The surface area of this sphere is given by $2\pi^{n/2}\, y^{(n-1)/2} / \Gamma(n/2)$. Right after the monetary shock these price gaps become $p_0 = \tilde p_0 - 1_n\delta$, where $1_n$ is an n dimensional vector of ones. So we have that the density of the distribution of the price gaps immediately after the monetary shock, but before any adjustment, is

$$\lambda(p_0, \delta) = f\big(\|p_0 + 1_n\delta\|^2\big)\, \frac{\Gamma(n/2)}{2\,\pi^{n/2}\, \|p_0 + 1_n\delta\|^{n-1}} \tag{35}$$

and recall that f(y) = 0 for any y > ȳ. We note that λ is a function of ȳ and δ, but it is independent of σ².

The next step is to find the contribution of those firms with price gap $p_0$ to the change in aggregate prices at horizon t. As in the case of one good, it suffices to consider the contribution of those firms that have their first price change exactly at t. This is because all the subsequent adjustments have a zero net contribution to prices, since after the adjustment the firm starts with a zero price gap. Since firms adjust prices when the square radius of the price gap vector first reaches ȳ at time t, we use the distribution of the corresponding hitting times and places on the sphere. In particular, let $G(p; t, p_0)$ be the probability that a firm with price gap $p_0$ at time zero will hit the surface of a sphere of radius $\sqrt{\bar y}$ at time t or before, with a price gap smaller than or equal to p. Note that G is a function of σ² and ȳ but it is independent of δ. Explicit expressions for the joint density g of the hitting time t and place p can be found in Wendel (1980) and Yin and Wang (2009).
When the price gap of the firm hits the sphere of radius $\sqrt{\bar y}$ with a price gap p, the average change of its n prices is given by "closing" each of the n price gaps, i.e. the average price change is given by $-(p_1 + \dots + p_n)/n$. Thus the contribution to the change in aggregate prices at time t after a shock δ at time zero is given by

$$\theta_n(\delta, t) = \int_{\|p_0\|^2 \le \bar y} \left[ \int_{\|p\|^2 = \bar y} -\frac{p_1 + \dots + p_n}{n}\; g(p, t; p_0)\, dp \right] \lambda(p_0, \delta)\, dp_0 . \tag{36}$$

Note that the outer integral is computed only for the firms that have not adjusted on impact, i.e. for the price gaps $\|p_0\|^2 \le \bar y$. Given the knowledge of the closed form expressions for both λ and g we can compute the multidimensional integrals in $\theta_n(\delta, t)$ by Monte Carlo. We adapt the expression for the density g of hitting times and places in Theorem 3.1 of Yin and Wang (2009) to the case of a BM with variance σ². Using the expression for the surface area of an n dimensional sphere in equation (36) we obtain:

Proposition 12. Fix n ≥ 2, then the impulse response can be written as

$$P_n(\delta, t) = \Theta_n(\delta) + \sum_{m=0}^{\infty} \sum_{k=1}^{\infty} e_{m,k}\big(\delta, \sqrt{\bar y}, n\big) \left[ 1 - \exp\left( -\frac{q_{m,k}^2}{2\,n}\, N_a\, t \right) \right] \tag{37}$$

where the coefficients $q_{m,k}$ are the ordered (positive) zeroes of the Bessel function $J_{m + \frac{n}{2} - 1}(\cdot)$. The coefficients $e_{m,k}(\cdot, \cdot, n)$ are functions homogeneous of degree one in $(\delta, \sqrt{\bar y})$ and do not depend on σ².

See Appendix A for the proof. Using the properties of $\Theta_n$ from Proposition 11, and the homogeneity property of $e_{m,k}$ from Proposition 12, in equation (37) one verifies part 1 and part 2 of Proposition 9. 21 Figure 4 displays some impulse responses for $P_n(\delta, \cdot)$. The figure uses equation (36) and is computed by Monte Carlo integration. 22 For each firm i we draw a path of the price gap for a discrete time approximation to the n dimensional BM's. We stop this path the first time $\tau_i$ at which $\|p^i(\tau_i)\|^2 \ge \bar y$. We then estimate $\theta_n(\delta, s)$ by summing the average price changes $-(p^i_1(\tau_i) + \dots + p^i_n(\tau_i))/n$ over the firms that stop at date s and dividing by the number of firms. We obtain an estimate of the value of $P_n$ by adding the corresponding values of $\hat\theta_n(\delta, s)$.
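The Monte Carlo procedure just described can be sketched in a few lines. The code below is a minimal illustration rather than the routine used for Figure 4: it uses a plain Euler discretization of the Brownian motions, samples the pre-shock state from the closed-form c.d.f. implied by the proof of Proposition 10, and relies on two relations that are assumptions of this sketch, namely Std(Δp) = sqrt(ȳ/n) and N_a = nσ²/ȳ (the reciprocal of the expected exit time of an n-dimensional Brownian motion from a ball of squared radius ȳ). All function names are hypothetical.

```python
import math
import random


def cdf_ytilde(x, n):
    """C.d.f. of y / y_bar under the invariant density (assumed closed form)."""
    if x <= 0.0:
        return 0.0
    if n == 2:
        return x * (1.0 - math.log(x))
    return (n * x - 2.0 * x ** (n / 2.0)) / (n - 2.0)


def draw_ytilde(n, rng):
    """Inverse-c.d.f. draw of y / y_bar by bisection on [0, 1]."""
    u, lo, hi = rng.random(), 0.0, 1.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if cdf_ytilde(mid, n) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def irf_mc(n, delta, std_dp, n_a, horizon, dt, n_firms, seed=0):
    """Cumulative price response on the grid 0, dt, 2*dt, ..., horizon."""
    rng = random.Random(seed)
    y_bar = n * std_dp ** 2                 # assumes Std(dp) = sqrt(y_bar / n)
    sigma = math.sqrt(n_a * y_bar / n)      # assumes N_a = n sigma^2 / y_bar
    step = sigma * math.sqrt(dt)
    n_steps = int(round(horizon / dt))
    flow = [0.0] * (n_steps + 1)            # flow[0] holds the impact effect
    for _ in range(n_firms):
        y = y_bar * draw_ytilde(n, rng)
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        scale = math.sqrt(y / sum(v * v for v in z))
        p = [scale * v - delta for v in z]  # price gaps right after the shock
        s = 0
        while sum(v * v for v in p) < y_bar and s < n_steps:
            s += 1
            p = [v + step * rng.gauss(0.0, 1.0) for v in p]
        if sum(v * v for v in p) >= y_bar:  # first adjustment happens at step s
            flow[s] += -sum(p) / (n * n_firms)
    path, level = [], 0.0
    for f in flow:
        level += f
        path.append(level)
    return path


if __name__ == "__main__":
    path = irf_mc(n=2, delta=0.01, std_dp=0.10, n_a=1.0,
                  horizon=2.0, dt=0.005, n_firms=4000)
    print(path[0], path[len(path) // 2], path[-1])
```

Only the first adjustment of each firm is recorded, which is the property emphasized in the text: later adjustments start from a zero price gap and contribute nothing on net. The first entry of the returned path is the simulated impact effect, and the path should approach δ as the horizon grows.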
We end this section with a full characterization of the impulse response function in the limit case in which n → ∞.
See Appendix A for the proof. The proposition shows that when n is large the impulse response takes a linear form, identical to the exogenous staggering model of Taylor (1980) and to the model of Reis (2006), where the staggering emerges from the optimal choice of adjustment subject to costly information gathering. The impact effect on the price level is of the order δ³, and hence for small values of δ it is negligible compared to the impact for the n = 1 case, i.e. $\Theta_n/\Theta_1 \downarrow 0$ for δ ↓ 0, as shown in Figure 3. Moreover, Figure 4 shows that the half life of the shock is $\left(1/2 - (\delta/Std(\Delta p))^2\right)/N_a$, which converges to $1/(2N_a)$ for small shocks. In Figure 4, describing a 1% shock in the money supply, the half life is three times greater than the one produced by the n = 1 case. A main consequence of a large n is that there is no selection effect. This is to be compared with the case of n = 1 where the selection effect is strongest and where, in the periods right after the shock (small t), all price adjustments are price increases. The reason for the lack of selection when n is large is that for a firm selling many products there are, upon adjustment, many cancellations since some prices will be increased and others decreased, so that the average price change across the firm's goods is simply δ.

21 Note that the coefficient of the exponentials can be written as the product of $q_{m,k}^2/(2n)$ and $N_a t$. Table 2 reports the smallest zeroes for each value of m + n/2 − 1.

22 In particular we start with a large number K of firms whose price gaps $p^i_0$ are drawn from the distribution λ(·, δ). For this we first draw $y_i$ from the known distribution f(·), then draw an n dimensional vector $Z_i$ of independent normals and take $p^i_0 = \sqrt{y_i}\, Z_i / \|Z_i\|$. Alternatively the computations might use equation (37), which in practice also involves numerical integration.
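The linear limit can be made concrete with a short worked calculation. The display below is a sketch that uses only the limiting properties established in the proof of Proposition 13: the impact fraction converges to (δ/Std(Δp))², every adjusting firm changes its average price by δ, and the fraction of first-time adjusters grows at rate N_a. The symbols P_∞, Θ_∞ and t_{1/2} are labels introduced here for the n → ∞ limit.

```latex
\begin{align*}
\Theta_\infty(\delta) &= \delta \left(\frac{\delta}{Std(\Delta p)}\right)^{2}, \\
P_\infty(\delta, t) &= \Theta_\infty(\delta) + \delta\, N_a\, t,
  \qquad 0 \le t < \frac{\delta - \Theta_\infty(\delta)}{\delta\, N_a}, \\
P_\infty(\delta, t_{1/2}) = \frac{\delta}{2}
  \;\Longrightarrow\;
  t_{1/2} &= \frac{1}{N_a}\left[\frac{1}{2} - \left(\frac{\delta}{Std(\Delta p)}\right)^{2}\right].
\end{align*}
```

For instance, with δ = 1% and Std(Δp) = 10% the squared normalized shock is 0.01, so the half life is 0.49/N_a, very close to the small-shock limit 1/(2N_a).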

On the output effect of monetary shocks
This section discusses how the impulse response of prices is informative about the interpretation, time profile and size of the output effect of a monetary shock. In the general equilibrium set-ups discussed at the beginning of this section the deviation of output from its steady state value is proportional to the deviation of real balances, $\delta - P_n(\delta, t)$, as shown in equation (23) and common to the models of Golosov and Lucas (2007); Caplin and Leahy (1997); Danziger (1999). From now on we refer to $\delta - P_n(\delta, t)$ as the impulse response of output, which is the expression predicted by our model in the case of log preferences (ε = 1). 23 The half life of the output response is identical to the half life of the price level only in the case in which $\Theta_n = 0$, i.e. when there is no jump of the price level on impact, a condition that holds for infinitesimal shocks. When the price level jumps on impact ($\Theta_n > 0$), the half-life of the output response is longer than the half-life of the price level. The reason is that the jump shortens the time required for the price level to reach its half-life value (i.e. δ/2), whereas the half-life target of the real output effect, given by $(\delta - \Theta_n(\delta))/2$, shifts, and so its half life is longer. To picture this effect in Figure 4, notice that different impact levels (corresponding to e.g. different values of n) do not shift the half-life line (whose position is at 0.5%), but will shift the half life line of the real-output effect (this line is not drawn in the figure, it is above 0.5 and shifts up as $\Theta_n$ increases). The impact effect on output also depends on the size of the shock: on the one hand, for very large shocks there is full price flexibility and hence no effect on output regardless of n; on the other hand, for small monetary shocks the impact effect on prices is of order smaller than δ, and hence the impact effect on output is approximately δ for all values of n.

23 If ε ≠ 1 then the effect on output should be divided by ε, as shown in equation (23).
As a summary statistic of the real effect of a monetary shock we use the area under the impulse response for output, i.e. $M_n(\delta) = \int_0^\infty \big(\delta - P_n(\delta, t)\big)\, dt$, which can be interpreted as the cumulative effect on output following the shock. This measure combines the size of the output deviations from the steady state with the duration of these deviations. Since $P_n(\delta, t)$ depends only on the parameters Std(Δp) and $N_a$, so does $M_n(\delta)$.
Because of the homogeneity of $P_n(\delta, t)$ discussed in part 2 of Proposition 9, and the way time ($N_a t$) enters $P_n(\delta, t)$ shown in Proposition 12, we can thus write

$$M_n\big(\delta;\, Std(\Delta p),\, N_a\big) = \frac{Std(\Delta p)}{N_a}\; M_n\!\left(\frac{\delta}{Std(\Delta p)};\, 1,\, 1\right) \tag{38}$$

so that the effect of a shock of size δ in an economy characterized by parameters Std(Δp) and $N_a$ can be readily computed using the "normalized" effect for an economy with unit parameters and a standardized shock. The determinants of the real effects of monetary shocks identified by equation (38) offer a new insight to measure the degree of aggregate price stickiness in menu cost models. The previous literature has focused almost exclusively on the frequency of price changes, $N_a$, as a measure of stickiness, and hence of the effect of monetary policy. But equation (38) shows that the dispersion of price changes, Std(Δp), is an equally important determinant. Indeed the area under the impulse response of output is proportional to the ratio of these two quantities, where the constant of proportionality depends on the (normalized) size of the monetary shock, δ/Std(Δp), and, in our set-up, on the number of products n. Figure 7 illustrates how the real output effect of a monetary shock varies with the size of the shock (δ) and the number of goods sold by the firm (n). The figure plots the summary impact measure as a function of δ for an economy with Std(Δp) = 0.10 and $N_a$ = 1, for four values of n. For each value of n the cumulative real effect of a monetary shock is hump-shaped in the size of the shock (δ). The effect is nil at the extremes, i.e. at δ = 0 and at 2 Std(Δp) (not shown), as a reflection of the fact that large shocks induce full price flexibility (see part 3 of Proposition 9). The figure shows that the real effect is maximal for a shock size δ that is about 1/2 of the standard deviation of price changes. More interestingly, for the purpose of this paper, the size of the real effects varies with the number of goods n.
Larger values of n, i.e. firms selling more goods, produce larger cumulative effects for small values of the shock and also larger maximum values of the effect. In this sense the stickiness of the economy is increasing in n. The maximum cumulative effect on output, on the order of 1.5 percent of output, is obtained as n → ∞, although a similar value already obtains for n = 10. On the other hand, smaller effects are produced in models with n = 1 or n = 2.
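The hump shape and the size of the maximum effect can be illustrated in the n → ∞ limit, where the impulse response is linear. The sketch below is not the computation behind Figure 7: it assumes the limiting properties from the proof of Proposition 13 (impact effect δ(δ/Std(Δp))², then a price level rising at rate δN_a until it reaches δ) and integrates the resulting triangular output response. The function name is hypothetical.

```python
def cumulative_output_limit(delta, std_dp, n_a):
    """Area under delta - P(delta, t) in the assumed linear n -> infinity limit."""
    if delta <= 0.0:
        return 0.0
    x = delta / std_dp                      # normalized shock
    if x >= 1.0:
        return 0.0                          # every firm adjusts on impact
    theta = delta * x ** 2                  # impact effect on the price level
    # Output gap starts at delta - theta and falls linearly at rate delta * n_a.
    return (delta - theta) ** 2 / (2.0 * delta * n_a)


if __name__ == "__main__":
    std_dp, n_a = 0.10, 1.0
    grid = [k * 1e-4 for k in range(1, 1000)]
    best = max(grid, key=lambda d: cumulative_output_limit(d, std_dp, n_a))
    print(best, cumulative_output_limit(best, std_dp, n_a))
```

With Std(Δp) = 0.10 and N_a = 1 the maximum in this limit occurs at a shock of roughly 0.45 standard deviations and is a little above 1.4 percent, in line with the magnitudes discussed in the text. The function also satisfies the scaling property of the cumulative effect: multiplying δ and Std(Δp) by a common factor scales the result by that factor, and raising N_a scales it down proportionally.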

Concluding remarks
We solved analytically a stylized model of price setting, giving a full characterization of the steady state predictions as well as of the economy's aggregate response to a once and for all unexpected monetary shock. A summary of the main findings was given in the Introduction. We conclude with a comment on two extensions.
First, the problem assumed zero inflation, that all goods entered symmetrically, and that the shocks to the price gaps of the different goods were uncorrelated. These assumptions deliver a spherical problem that turns out to be analytically tractable. Appendix E shows that the zero inflation assumption provides a good approximation to the true rules for inflation rates that are small relative to the variance of idiosyncratic shocks, an assumption that seems appropriate for developed economies. We think that relaxing the independence of the shocks is both interesting and realistic. We leave this much more challenging task for future work. Second, the maximum level of kurtosis for the distribution of price changes predicted by the model is 3, as in the Normal distribution, obtained for a firm selling a large number of goods. This value is larger than the prediction of the classical menu cost models of Barro (1972) or Dixit (1991) (where kurtosis is 1), but it is still small compared to the large excess kurtosis measured in micro datasets. Larger values of the kurtosis can be obtained by introducing the possibility of random adjustment opportunities, as in models where the size of the menu cost is stochastic. We explore this problem in Alvarez, Le Bihan, and Lippi (2011) and show that this assumption improves the empirical fit of the model cross section to the micro data, and that it reduces the "selection" effect in the event of a monetary shock.

A Proofs
Lemma 1. The origin is the optimal return point.
Proof. (of Lemma 1 ) By contradiction, suppose that it is not and assume without loss of generality that t = 0 is a period where an adjustment takes place and thatp i > 0. Call this "policy A". Then, consider an alternative plan wherep ′ i = 0 and where I ′ = I + {(0, 0, ..., −p i , ..., 0)} so that the next adjustment happens exactly with the probabilities dictated by policy A. Letting τ be the next stopping time, then for 0 ≤ t ≤ τ we have i , and thus settingp i > 0 shows that policy A is not optimal.
Proof. (of Proposition 2 ) Notice that v ′ (0) = β 1 and that v(0) = β 0 , so that we require β 1 > 0, which implies β 0 > 0. Moreover, if β 1 > B/r then v is strictly increasing and strictly convex. If β 1 = B/r then v is linear in y. If 0 < β 1 < B/r, then v is strictly increasing at the origin, strictly concave, and it reaches its unique maximum at a finite value of y. Thus, a solution that satisfies smooth pasting requires that 0 < β 1 < B/r, and the maximizer isȳ.
In this case, y = 0 achieves the minimum in the range [0,ȳ]. Thus we have verified i), ii) and iii).
Next we prove uniqueness. Let β i (β 1 ) be the solution of equation (15), as a function of β 1 . Note that for 0 < β 1 < B/r, all the β i (β 1 ) < 0 for i ≥ 2 and are increasing in β 1 , converging to zero as β 1 goes to B/r. Smooth pasting can be written as where the notation emphasizes that all the β i can be written as a function of β 1 . From the properties of the β i (·) discussed above it follows that we can write the unique solution of 0 = v ′ (ρ(β 1 ); β 1 ) as a strictly increasing function of β 1 , i.e.ρ ′ (β 1 ) > 0. The value matching condition atȳ gives: We note that, given the properties of β i (·) discussed above, for any given y > 0 we have that v(y, β 1 ) − β 0 (β 1 ) is strictly increasing in β 1 , as long as 0 < β 1 < B/r. Thus, define From the properties discussed above we have that Ψ(β 1 ) is strictly increasing in β 1 and that it ranges from 0 to ∞ as β 1 ranges from 0 to B/r. Thus Ψ is invertible. The solution of the problem is given by setting: β 1 (ψ) = Ψ −1 (ψ) andȳ(ψ) =ρ(β 1 (ψ)) .
Proof. (of Proposition 3) We establish the following properties of V (p) = v(y): 1. It only depends on the absolute value of the prices, since for all p ∈ R n :

2. The range of inaction is given by
3. It solves the ODE given by equation (3). This can be seen by computing: replacing this into the ODE equation (3) we obtain the ODE equation (10), which v solves by hypothesis.
4. It satisfies value matching equation (4), which is immediate since it satisfied the value matching condition for v given in equation (11).
5. It satisfies smooth pasting equation (6). Using the form of the solution for v, namely: Using that v satisfies smooth pasting we have: 0 = n j=1 β j j ( n k=1 p 2 k ) j−1 for any p with n k=1 p 2 k =ȳ, which establishes that V i (p) = 0 for all i = 1, .., n and for any p ∈ ∂I.
In Appendix C we show that a function V (p) with these properties is a strong solution to the variational inequality of the problem, and hence it is the value function.
Proof. (of Proposition 4 ) Using the expression for {β i } obtained in Proposition 1 value matching and smooth pasting can be written as two equations in β 2 andȳ: . This gives an implicit equation forȳ: Since the right hand side of equation (39) is strictly increasing inȳ, and goes from zero to infinity, then we obtain Part (i). Since the right hand side of equation (39) is strictly decreasing in n, and goes to zero as n → ∞, then we obtain Part (ii).
Rearranging this equation and defining z =ȳ r/σ 2 ψ 2(n + 2) where ω i = i s=1 1 (s+2)(n+2s+2) . Using the expression for ω i and co llecting terms on z i one can show that the square bracket of equation (40) that multiplies z 3 is negative, and hencē y > ψ 2(n + 2)σ 2 /B. Letting b = ψr 2 2(n + 2)/(Bσ 2 ) we can write equation (40) as: Since z ↓ 0 as b ↓ 0, then z 2 /b ↓ 1 as b ↓ 0, establishing Part (iii). Inspection of equation (41) reveals that as b ↑ ∞, then z 2 /b ↑ ∞ since the term in the round parenthesis that multiplies z 2 /b on the right hand side goes to zero. Let Ω(z) denote this term. Taking logs on both sides of equation (41) and differentiating with respect to b gives ∂ log z ∂ log b = 2 + ∂ log Ω(z) Noting that Ω(z) > 0 and that it is decreasing in z implies that the elasticity is ∂ log z ∂ log b is increasing in b. This establishes Part (iv). From equation (40) it is clear that the optimal threshold satisfiesȳ = σ 2 r Q ψ Bσ 2 r 2 , n . Differentiating this expression we obtain Part (v).
Proof. (of Proposition 6) The proof uses probability theory results on the first passage time of an n-dimensional Brownian motion by Ciesielski and Taylor (1962), and a characterization of the zeros of the Bessel function by Hethcote (1970). Let τ be the stopping time defined by the first time when $\|p(\tau)\|^2$ reaches the critical value ȳ, starting at $\|p(0)\| = 0$ at time zero. Let $S_n(t, \bar y)$ be the probability that τ ≥ t, i.e. let $S_n(\cdot, \bar y)$ be the survival function. Theorem 2 in Ciesielski and Taylor (1962) shows that for n ≥ 1:

$$S_n(t, \bar y) = \sum_{k=1}^{\infty} \xi_{n,k}\, \exp\left(-\frac{q_{n,k}^2\, \sigma^2}{2\, \bar y}\, t\right), \qquad \xi_{n,k} = \frac{1}{2^{\nu - 1}\, \Gamma(\nu + 1)}\; \frac{q_{n,k}^{\nu - 1}}{J_{\nu + 1}(q_{n,k})},$$

where $J_\nu(z)$ is the Bessel function of the first kind, where ν = (n − 2)/2, where $q_{n,k}$ are the positive zeros of the Bessel function $J_\nu(z)$, indexed in ascending order according to k, and where Γ is the gamma function. The hazard rate is then given by:

$$h_n(t, \bar y) = -\frac{1}{S_n(t, \bar y)}\, \frac{\partial S_n(t, \bar y)}{\partial t} .$$

Hethcote (1970) provides a lower bound for the zeroes of the Bessel function $q_{n,k}$ for n ≥ 2, given by $q_{n,k}^2 > \left(k - \tfrac{1}{4}\right)^2 \pi^2 + \left(\tfrac{n}{2} - 1\right)^2$.
Proof. (of Proposition 7 ) We first establish the following Lemma.

Lemma 2.
Let z be distributed uniformly on the surface of the n-dimensional sphere of radius one. We use x for the projection of z on any of the dimensions, so $z_i = x \in [-1, 1]$. The marginal distribution of $x = z_i$ has density:

$$\frac{\Gamma(n/2)}{\Gamma(1/2)\, \Gamma((n-1)/2)}\, \left(1 - x^2\right)^{(n-3)/2}, \qquad x \in (-1, 1),$$

where the Γ functions make the density integrate to one.
This lemma is an application of Theorem 2.1, part 1 in Song and Gupta (1997), setting p = 2, so that the norm is Euclidean, and k = 1, so that it is the marginal of one dimension. We give a simpler proof below. Now we consider the case where the sphere has radius different from one. Let p ∈ ∂I, then $p = \sqrt{\bar y}\, z$, where z is uniformly distributed in the n dimensional sphere of radius one. Thus each $p_i$ has the same distribution as $x \sqrt{\bar y}$.
Using the change of variable formula we obtain the required result. Part 2 of Theorem 2.1 in Song and Gupta (1997) shows that if x is the marginal of a vector uniformly distributed on the surface of the n-dimensional sphere, then $x^2$ is distributed as a Beta(1/2, (n − 1)/2). If y is distributed as a Beta(α, β) then it has E(y) = α/(α + β) and $E(y^2) = \frac{\alpha + 1}{\alpha + \beta + 1}\, E(y)$. Using these expressions for α = 1/2 and β = (n − 1)/2 we obtain the results for the standard deviation of $\Delta p_i$ and its kurtosis. For the expected value of the absolute value of price changes we note that where we use equation (21) for the density w(·) and, in the last equality, the following result: Then, using the fundamental property of the Gamma function, we have We can approximate these ratio of Gamma functions as which we obtain our expression. For $Std(|\Delta p_i|)/E(|\Delta p_i|)$ we use that, given the symmetry around zero we have: For the convergence of $\Delta p_i/Std(\Delta p_i)$ to a normal, we show that $y = x^2 n$ converges to a chi-square distribution with 1 d.o.f., where x is the marginal of a uniform distribution on the surface of the n-dimensional sphere. The p.d.f. of y ∈ [0, n], the square of the standardized x, is $y^{-1/2}\,(1 - y/n)^{(n-3)/2} \big/ \big(\sqrt{n}\; Beta(\tfrac{1}{2}, \tfrac{n-1}{2})\big)$. Then, fixing y, taking logs in the ratio of the two p.d.f.'s, and taking the limit as n → ∞, using that $\Gamma(n/2) \big/ \big(\Gamma((n-1)/2)\, \sqrt{n/2}\big) \to 1$ as n → ∞, we obtain that the ratio of the two p.d.f.'s converges to one.
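The moment calculation behind the standard deviation and kurtosis results can be written out explicitly. The lines below are a sketch that uses only the Beta(1/2, (n − 1)/2) distribution of x², the two Beta moment formulas quoted above, and the fact that each price change is distributed as x times the square root of ȳ.

```latex
\begin{align*}
E(x^2) &= \frac{1/2}{n/2} = \frac{1}{n}, &
E(x^4) &= \frac{3/2}{(n+2)/2}\, E(x^2) = \frac{3}{n\,(n+2)}, \\
Std(\Delta p_i) &= \sqrt{\bar y\, E(x^2)} = \sqrt{\bar y / n}, &
Kurt(\Delta p_i) &= \frac{E(x^4)}{\left[E(x^2)\right]^2} = \frac{3\,n}{n+2}.
\end{align*}
```

The kurtosis equals 1 for n = 1 and increases monotonically to 3 as n → ∞, which matches the two benchmark values mentioned in the concluding remarks.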
Proof. (of Proposition 10) The forward Kolmogorov equation is:

$$0 = \frac{1}{2}\, \frac{\partial^2}{\partial y^2}\Big[ 4\,\sigma^2\, y\, f(y) \Big] - \frac{\partial}{\partial y}\Big[ n\,\sigma^2\, f(y) \Big] \quad \text{for } y \in (0, \bar y),$$

with boundary conditions:

$$1 = \int_0^{\bar y} f(y)\, dy \quad \text{and} \quad f(\bar y) = 0 .$$

The first boundary condition ensures that f is a density. The second is implied by the fact that when the process reaches ȳ it is returned to the origin, so the mass escapes from this point. Equation (45) implies the second order ODE: $f'(y)\left(\tfrac{n}{2} - 2\right) = y\, f''(y)$. The solution of this ODE for n ≠ 2 is $f(y) = A_1\, y^{n/2 - 1} + A_0$ for two constants $A_0, A_1$ to be determined using the boundary conditions equation (45): $0 = A_1\, \bar y^{\,n/2 - 1} + A_0$ and $1 = \frac{A_1}{n/2}\, \bar y^{\,n/2} + A_0\, \bar y$. For n = 2 the solution is $f(y) = -A_1 \log(y) + A_0$ subject to the analogous conditions. Solving for the coefficients $A_0, A_1$ gives the desired expressions.
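Solving the two boundary conditions for the constants is mechanical. The display below sketches the algebra under the conditions stated in the proof (the density is zero at ȳ and integrates to one), first for n ≠ 2 and then for the logarithmic case n = 2.

```latex
\begin{align*}
n \neq 2: \quad & A_0 = -A_1\, \bar y^{\,n/2-1}, \qquad
  1 = A_1\, \bar y^{\,n/2}\left(\frac{2}{n} - 1\right)
  \;\Longrightarrow\; A_1 = -\frac{n}{(n-2)\, \bar y^{\,n/2}}, \\
& f(y) = \frac{n}{(n-2)\, \bar y}\left[\, 1 - \left(\frac{y}{\bar y}\right)^{n/2-1} \right], \\
n = 2: \quad & A_0 = A_1 \log \bar y, \qquad
  1 = A_1 \int_0^{\bar y} \log\!\left(\frac{\bar y}{y}\right) dy = A_1\, \bar y
  \;\Longrightarrow\; f(y) = \frac{1}{\bar y}\, \log\!\left(\frac{\bar y}{y}\right).
\end{align*}
```

As a consistency check, the n = 4 case gives a density that is linear in y, and as n → ∞ the term in square brackets tends to one for every y below ȳ, so the density tends to the uniform density 1/ȳ.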
Proof. (of Proposition 11) The only result to be established is that the distribution of the sum of the coordinates of a vector uniformly distributed in the n−dimensional sphere has density given by equation (32). Using the result in page 387 of Khokhlov (2006), let c : R → R be measurable, and let L be the Lebesgue measure in n dimensional sphere, then x∈R n ,||x||=1 where Z ν m (x) are the Gegenbauer polynomials of degree m and ν, where ∠p0p 0 ≡ (p · p 0 ) / (||p|| ||p 0 ||) is the angle between p 0 and p, where q m,k is the k-th (ordered) zero of the Bessel function J m+ n 2 −1 (·), and where J ′ m+ n 2 −1 (·) is the derivative of the Bessel function. The expression in Proposition 12 follows from integrating the right hand side of equation (46) with respect to t, and thus the coefficients e m,k are given by e m,k (δ, √ȳ , n) = ̺ m,k (δ, √ȳ , n)ȳ 2/ q 2 m,k . Using this expression, it is immediate that the homogeneity of degree one of e m,k (·, n) is equivalent to the homogeneity of degree −1 of ̺ m,k (·, n). To show this homogeneity we first prove two properties: i) Writing λ (p 0 , δ, √ȳ ), where we includeȳ as an argument, since it is an argument of f , see equation (30). Direct computation on equation (35) gives λ (p 0 , aδ, a √ȳ ) = λ (p 0 /a, δ, √ȳ ) /a n+1 for any a > 0. ii) Direct computation on ̟ m,k gives the following: ̟ m,k (p, p 0 , a √ȳ , n) = ̟ m,k (p/a, p 0 /a, √ȳ , n)/a n+1 for any a > 0. Using i) and ii) into the expression for ̺ m,k and the change of variables p ′ 0 = p 0 /a and p ′ = p/a, and that the determinant of the Jacobian in each of the two integrals is a n , we obtain the desired result, i.e. the −1 homogeneity of ̺ m,k .
The derivation of the results for $n = 2$ follows exactly the same steps, but substitutes expression (3.2) in Theorem 3.1 of Yin and Wang (2009) into equation (36). Appendix F.4 displays the exact expressions.
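Proof. (of Proposition 13) Let $\tilde y \equiv y/\bar y$ be the initial values under the invariant distribution $f$ displayed in equation (30), and let $\tilde y(\delta)$ be the values that correspond to the same price gaps right after the monetary shock but before any adjustment has taken place. Thus for each $\tilde y$ there is a price gap vector $p$ for which $\tilde y = \|p\|^2/\bar y$, and hence for this price gap vector $\tilde y(\delta) = \|p - \delta 1_n\|^2/\bar y$. Take a value of $y \in (0, \bar y)$. Developing the square in the expression for the corresponding value of $\tilde y(\delta)$, multiplying and dividing the second term by $\sqrt{y}$, and using the definition of $\tilde y$, we have: $\tilde y(\delta) = \tilde y - 2\delta\,\frac{\sqrt{y}}{\bar y}\,\frac{\sum_{i=1}^n p_i}{\sqrt{y}} + \frac{n\delta^2}{\bar y}$. Using $Std(\Delta p) = \sqrt{\bar y/n}$ we write: $\tilde y(\delta) = \tilde y - 2\,\frac{\delta}{Std(\Delta p)}\,\sqrt{\tilde y}\,\frac{1}{\sqrt{n}}\,\frac{\sum_{i=1}^n p_i}{\sqrt{y}} + \left(\frac{\delta}{Std(\Delta p)}\right)^2$. Conditional on $\tilde y$, we can regard $\tilde y(\delta)$ as a random variable whose realizations correspond to each of the price gaps with $\|p\|^2/\bar y = \tilde y$, and where the price gaps $p$ are uniformly distributed on the sphere with square radius $y$. Proposition 11 gives the density of the random variable $\sum_{i=1}^n p_i/\sqrt{y}$, and using Proposition 7 it follows that for all $n$ its standard deviation is equal to one and its expected value is equal to zero. Thus $\frac{1}{\sqrt{n}}\,\frac{\sum_{i=1}^n p_i}{\sqrt{y}} \to 0$, and hence $\tilde y(\delta) \to \tilde y + \left(\delta/Std(\Delta p)\right)^2$ as $n \to \infty$, where the convergence to a (degenerate) random variable is in distribution.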
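The algebra in this proof can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes $\tilde y(\delta) = \|p - \delta 1_n\|^2/\bar y$ and $Std(\Delta p) = \sqrt{\bar y/n}$, draws price gaps uniformly on a sphere by normalizing Gaussian vectors, and compares the direct post-shock value with the expanded expression. It also estimates the mean and variance of $\sum_i p_i/\sqrt{y}$, which the proof states are zero and one. All function names are hypothetical.

```python
import math
import random


def uniform_on_sphere(n, radius, rng):
    """Draw a point uniformly on the sphere of the given radius in R^n."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [radius * x / norm for x in g]


def post_shock_ytilde(p, delta, ybar):
    """Direct computation of ||p - delta*1_n||^2 / ybar."""
    return sum((pi - delta) ** 2 for pi in p) / ybar


def expanded_ytilde(p, delta, ybar):
    """Same object, written with the shock normalized by Std(dp) = sqrt(ybar/n)."""
    n = len(p)
    y = sum(pi * pi for pi in p)
    ytilde = y / ybar
    shock = delta / math.sqrt(ybar / n)
    z = sum(p) / math.sqrt(y)  # normalized sum of the price gaps
    return ytilde - 2.0 * shock * math.sqrt(ytilde) * z / math.sqrt(n) + shock ** 2


def normalized_sum_moments(n, draws, rng):
    """Sample mean and variance of sum_i p_i / sqrt(y) for p uniform on a sphere."""
    zs = [sum(uniform_on_sphere(n, 1.0, rng)) for _ in range(draws)]
    mean = sum(zs) / draws
    var = sum((z - mean) ** 2 for z in zs) / draws
    return mean, var
```

Because the normalized sum has unit variance for every $n$, dividing it by $\sqrt{n}$ makes the middle term of the expansion vanish as $n$ grows, which is the step the proof relies on.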
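Proposition 10 shows that as $n \to \infty$ the distribution of $\tilde y$ converges to a uniform distribution on $[0,1]$. Combining this result with equation (48), the distribution of $\tilde y(\delta)$ converges to a uniform distribution on $\left[(\delta/Std(\Delta p))^2,\ 1 + (\delta/Std(\Delta p))^2\right]$. Immediately after the monetary shock any firm with $y > \bar y$, or equivalently any firm with $\tilde y(\delta) > 1$, adjusts its prices. From here we see that the fraction of firms that adjust immediately after the shock, denoted by $\Phi_n$, converges to $(\delta/Std(\Delta p))^2$.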
To characterize $P_n$ for $t \ge 0$ we establish three properties: i) the expected price change conditional on adjusting at time $t = 0$ is equal to $\delta$, ii) the fraction of firms that adjust for the first time after the shock between $0$ and $t < [\delta - \Theta_n(\delta)]/[\delta N_a]$ equals $N_a t$, and iii) the expected price change conditional on adjusting at time $0 \le t < [\delta - \Theta_n(\delta)]/[\delta N_a]$ is equal to $\delta$. To establish i), note that, as argued above, as $n \to \infty$ a firm adjusts its prices if and only if its price gap $p$ before the monetary shock has a square radius, normalized by $\bar y$, larger than $1 - (\delta/Std(\Delta p))^2$. Since in the invariant distribution price gaps are uniformly distributed on each of the spheres, the expected price change across the firms with the same value of $y$ equals $\delta$. To establish ii), note that, keeping $N_a$ constant as $n$ becomes large, the law of motion for $\tilde y$ in equation (20) converges to a deterministic one, namely $\tilde y_t = \tilde y_0 + N_a t$. This, together with the uniform distribution for $\tilde y_0$, implies the desired result. Finally, iii) follows from combining i) and ii).
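One way to read properties i) to iii) together is as a piecewise-linear limiting path for the share of firms that have adjusted, and hence for the aggregate price level. The following Python sketch encodes that reading; the closed form $\delta \cdot \min\{1, (\delta/Std(\Delta p))^2 + N_a t\}$ is an inference from the three properties and from the limit of $\Phi_n$ above, not a formula quoted from the text, and the function names are hypothetical.

```python
def limit_adjuster_share(delta, std_dp, n_a, t):
    """Share of firms that have adjusted by time t in the n -> infinity limit.

    Assumes an impact share (delta/std_dp)**2 and a constant flow n_a of
    first-time adjusters until every firm has adjusted.
    """
    impact = (delta / std_dp) ** 2
    return min(1.0, impact + n_a * t)


def limit_price_response(delta, std_dp, n_a, t):
    """Aggregate price response, given that each adjuster changes prices by delta on average."""
    return delta * limit_adjuster_share(delta, std_dp, n_a, t)


def time_to_full_adjustment(delta, std_dp, n_a):
    """First date at which all firms have adjusted (zero if the shock is large enough)."""
    impact = (delta / std_dp) ** 2
    return max(0.0, (1.0 - impact) / n_a)
```

Under these assumptions the date returned by `time_to_full_adjustment` coincides with the bound $[\delta - \Theta_n(\delta)]/[\delta N_a]$ when $\Theta_n(\delta) = \delta\,(\delta/Std(\Delta p))^2$.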

ADDITIONAL MATERIAL (NOT FOR PUBLICATION)

B General Equilibrium Set-Up
This appendix outlines the general equilibrium set-up that underlies our approximation. The preferences of the representative agent are given by: where $c(t)$ is an aggregate of the goods produced by all firms, $\ell(t)$ is the labor supply, $m(t)$ the nominal quantity of money, and $P(t)$ the nominal price of one unit of consumption, formally defined below (all variables at time $t$). We will use $U(c) = (c^{1-\epsilon} - 1)/(1 - \epsilon)$. There is a unit mass of firms, indexed by $k \in [0,1]$, and each of them produces $n$ goods, indexed by $i = 1, \dots, n$. There is a preference shock $A_{k,i}(t)$ associated with good $i$ produced by firm $k$ at time $t$, which acts as a multiplicative shifter of the demand of each good $i$. Let $c_{k,i}(t)$ be the consumption of product $i$ produced by firm $k$ at time $t$. The composite Dixit-Stiglitz consumption good $c$ is: For firm $k$ to produce $y_{k,i}(t)$ of good $i$ at time $t$ requires $\ell_{k,i}(t) = y_{k,i}(t) Z_{k,i}(t)$ units of labor, so that $W(t) Z_{k,i}(t)$ is the marginal cost of production. We assume that $A_{k,i}(t) = Z_{k,i}(t)^{\eta-1}$, so the (log) marginal cost and the demand shock are perfectly correlated. We assume that $Z_{k,i}(t) = \exp(\sigma W_{k,i}(t))$, where the $W_{k,i}$ are standard Brownian motions, independent across all $i, k$.
The budget constraint of the representative agent is: where $R(t)$ is the nominal interest rate, $Q(t) = \exp\left(-\int_0^t R(s)\,ds\right)$ the price of a nominal bond, $W(t)$ the nominal wage, $\tau(t)$ the lump sum nominal transfers, $\tau_\ell$ a constant labor subsidy rate, and $\Pi(t)$ the aggregate (net) nominal profits of firms.
The first order conditions for the household problem are (with respect to $\ell$, $m$, $c$, $c_{k,i}$): where $\lambda_0$ is the Lagrange multiplier of the agent's budget constraint. If the money supply follows $m(t) = m(0)\exp(\mu t)$, then in an equilibrium and for all $t$: Moreover the first order condition for $\ell$ and the one for $c$ give the output equation: From the household's first order conditions for $c_{k,i}(t)$ and $\ell(t)$ we can derive the demand for product $i$ of firm $k$, given by: In the impulse response analysis of Section 5 we assume $\mu = 0$, $\tau_\ell = 0$, and that the initial value of $m(0)$ is such that $m(0)/P(0)$, computed using the invariant distribution of prices charged by firms, is different from its steady state value.
The nominal profit of a firm $k$ from selling product $i$ at price $P_{k,i}$, given that the demand shock is $A_{k,i}$, marginal cost is $Z_{k,i}$, nominal wages are $W$ and aggregate consumption is $c$, is (we omit the time index): or, collecting $W Z_{k,i}$ and using that $A_{k,i} Z_{k,i}^{1-\eta} = 1$, gives: so that the nominal profits of firm $k$ from selling product $i$ with a price gap $p_{k,i}$ is: where we rewrite the actual markup in terms of the price gap $p_{k,i}$, defined in equation (22), i.e. $\frac{P_{k,i}}{W Z_{k,i}} = e^{p_{k,i}}\,\frac{\eta}{\eta-1}$. This shows that the price gap $p_{k,i}$ is sufficient to summarize the value of profits for product $i$. Note also that, by simple algebra, $\Pi(p_{k,i})/\Pi(0) = e^{-\eta p_{k,i}}\left[1 + \eta\, e^{p_{k,i}} - \eta\right]$, which we use below.
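The profit ratio at the end of this paragraph can be verified with a short Python sketch. It assumes, as the text suggests, that profits are proportional to $m^{-\eta}(m - 1)$ where $m = P_{k,i}/(W Z_{k,i}) = e^{p}\,\eta/(\eta-1)$ is the markup; the proportionality constant cancels in the ratio. The quadratic approximation $1 - \tfrac{1}{2}\eta(\eta-1)p^2$ is obtained here by a second order Taylor expansion of the ratio around $p = 0$ and is included only as an illustration. Function names are hypothetical.

```python
import math


def profit_up_to_scale(price_gap, eta):
    """Profits up to a positive constant: markup**(-eta) * (markup - 1)."""
    markup = math.exp(price_gap) * eta / (eta - 1.0)  # P / (W Z)
    return markup ** (-eta) * (markup - 1.0)


def profit_ratio(price_gap, eta):
    """Closed form Pi(p)/Pi(0) = exp(-eta p) * (1 + eta exp(p) - eta)."""
    return math.exp(-eta * price_gap) * (1.0 + eta * math.exp(price_gap) - eta)


def profit_ratio_quadratic(price_gap, eta):
    """Second order expansion around p = 0 (the first order term is zero)."""
    return 1.0 - 0.5 * eta * (eta - 1.0) * price_gap ** 2
```

The ratio equals one at a zero price gap and falls below one on either side, which is the sense in which the price gap summarizes the profit loss from a suboptimal price.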
Next we show that the ideal price index $P(t)$, i.e. the price of one unit of the composite good, can be fully characterized in terms of the price gaps. Using the definition of total expenditure (omitting the time index), $P c = \int_0^1 \sum_{i=1}^n \left(P_{k,i}\, c_{k,i}\right) dk$, replacing $c_{k,i}$ from equation (53), and using the first order condition with respect to $c$ to substitute for the $c^{-\epsilon}$ term, gives: which is the usual expression for the ideal price index, and can be written in terms of the price gaps using $\frac{P_{k,i}}{W Z_{k,i}} = e^{p_{k,i}}\,\frac{\eta}{\eta-1}$.

B.1 The firm problem
We assume that if firm $k$ adjusts any of its $n$ nominal prices at time $t$ it must pay a fixed cost equal to $\psi_\ell$ units of labor. We express these units of labor as a fraction $\psi$ of the steady state frictionless profits from selling one of the $n$ products, i.e. the dollar amount that has to be paid in the event of a price adjustment at $t$ is $\psi_\ell W(t) = \psi\, W(t)\, \bar c^{\,1-\eta\epsilon}\, \Pi(0)$. To simplify notation, we omit the firm index $k$ in what follows, and denote by $p$ the vector of price gaps and by $p_i$ its $i$-th component. The time 0 problem of a firm selling $n$ products that starts with a price gap vector $p$ is to choose $\{\tau_j, \Delta p_i(\tau_j)\}_{j=1}^{\infty}$ to minimize the negative of the expected discounted (nominal) profits net of the menu cost (the signs are chosen so that the value function is comparable to the loss function in equation (1)): (0), and using that equilibrium wages are constant, $W(t)/\bar W = e^{\delta}$, and using the parameterization of the fixed cost in terms of steady state profits, $\psi_\ell = \psi\, \bar c^{\,1-\eta\epsilon}\, \Pi(0)$, gives (where bars denote steady state values): s.t. equation (2), $\Delta p_i(\tau_j) \equiv \lim_{t \downarrow \tau_j} p_i(t) - \lim_{t \uparrow \tau_j} p_i(t)$ for all $i \le n$ and $j \ge 0$, and where Expanding $S(c, p_i)$ around $c = \bar c$, $p_i = 0$ and using that: into equation (55), we obtain:
6. The second derivatives of $V$ are bounded in a neighborhood of $\partial I$,
7. the stopping times $\tau^*_i$ that achieve the solution are finite,
8. Let $\tau^*$ be the optimal stopping time starting from $p(0)$, the family $\{e^{-r\tau} V(p(\tau));\ \tau \le \tau^*\}$ is uniformly integrable for all $p(0)$.
For completeness we state the definition of a Lipschitz surface.

D Numerical accuracy of the approximation
In this section we present some evidence on the numerical accuracy of the approximation. We compare the value of $\bar y$ obtained from the quadratic approximation to $v$ described above with what we call the "exact" solution, which is the numerical solution using up to 30 terms for $\beta_i$ in its expansion. The approximations are closer for smaller values of $\sigma$ and $\psi$, which we regard as more realistic. The next figure shows the value of $N_a(n)$ for various $n$ when the menu cost has constant returns to scale, so $\psi = \psi_1 n$, using the approximation and using the "exact" expression.

E Sensitivity to inflation
In this section we analyze the effect of inflation on the frequency of price adjustments and the size distribution of price changes under the assumption that the inflation rate, which we denote by $\mu$, is small. We model inflation as introducing a constant common drift on each of the $n$ target prices $\{p^*_i(t)\}$. Equivalently, this means that each of the price gaps $\{p_i(t)\}$ has drift $-\mu$, so equation (2) becomes $p_i(t) = -\mu t + \sigma W_i(t) + \sum_{j:\tau_j < t} \Delta p_i(\tau_j)$ for all $t \ge 0$ and $i = 1, 2, \dots, n$.
We were not able to characterize the solution of the problem for arbitrary values of µ.
Recall that in the case of $\mu = 0$ the distribution of $\|p(t + \Delta t)\|^2$ conditional on time $t$ information depends only on $\|p(t)\|^2$. Thus, since the objective function is proportional to $y(t) \equiv \|p(t)\|^2$, the state of the problem can be taken to be the scalar $y(t)$, and hence the shapes of the control and inaction regions are functions of $y(t)$ alone. In the case of $\mu \neq 0$ the distribution of $\|p(t + \Delta t)\|^2$ conditional on time $t$ information depends on $\|p(t)\|^2$ as well as on $\mu\left[\sum_{i=1}^n p_i(t)\right]$. Thus the state of the problem will not be solely $y$, and hence the control and inaction sets will not be functions exclusively of $y = \|p\|^2$. Also, in the case of $\mu \neq 0$ it will not be the case that at a time $\tau$ where the firm adjusts prices $\Delta p_i(\tau) = -p_i(\tau)$. In other words, conditional on an adjustment, firms will not set the price gap equal to the static optimal value, since the state has a drift. Yet, even though we have not solved the model for positive inflation, the next proposition shows that for many statistics inflation has only a second order effect.
For the next proposition we explicitly write $\mu$ as an argument of the value function $V(p, \mu)$, and of statistics such as the frequency of price changes $N_a(\mu)$, the hazard rate of price changes $h(t, \mu)$, the moments of the distribution of price changes $E[\Delta p_i, \mu]$, etc. We also define the density of the marginal distribution of the absolute value of price changes, $\ell(|\Delta p_i|, \mu)$, and the average value function $E[V](\mu)$, i.e. the expected value of the value function under the invariant distribution of the price gaps $g(p, \mu)$: $E[V](\mu) \equiv \int_{\mathbb{R}^n} V(p, \mu)\, g(p, \mu)\, dp$. We have:
Proposition 14. Assume that all the functions below are differentiable. Then (i) $\frac{\partial}{\partial \mu} N_a(\mu)\big|_{\mu=0} = 0$ and $\frac{\partial}{\partial \mu} h(t, \mu)\big|_{\mu=0} = 0$ for all $t \ge 0$, (ii) $\frac{\partial}{\partial \mu} E[\Delta p_i, \mu]\big|_{\mu=0} = 1/N_a(0)$, (iii) $\frac{\partial}{\partial \mu} E\left[(\Delta p_i - E[\Delta p_i, \mu])^{2k}, \mu\right]\big|_{\mu=0} = 0$ for $k = 1, 2, \dots$, (iv) $\frac{\partial}{\partial \mu} \ell(|\Delta p_i|, \mu)\big|_{\mu=0} = 0$ for all $|\Delta p_i| < \sqrt{\bar y}$, and (v) $\frac{\partial}{\partial \mu} E[V](\mu)\big|_{\mu=0} = 0$.
Part (i) shows that the average number of adjustments per unit of time, $N_a(\mu)$, is insensitive to inflation at $\mu = 0$. Indeed, the whole hazard rate function of price adjustment, $h(t, \mu)$, is insensitive to inflation at $\mu = 0$. Part (ii) states that the expected value of price changes increases linearly with $\mu$ with slope $1/N_a(0)$, at least for small values of $\mu$. This follows from (i) and from the identity $\mu = N_a(\mu)\, E[\Delta p_i, \mu]$, i.e. that the product of the average price change times the number of adjustments equals the inflation rate.
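The accounting identity $\mu = N_a(\mu)\,E[\Delta p_i, \mu]$ can be illustrated by simulation. The Python sketch below is not the optimal policy under inflation, which the text says has not been characterized: it assumes a simple rule that resets all price gaps to zero whenever $\|p\|^2 \ge \bar y$, discretizes the price gap process with an Euler scheme, and reports the frequency of adjustment and the mean price change. The parameter values in the usage note and the function name are hypothetical.

```python
import math
import random


def simulate_threshold_rule(n, mu, sigma, ybar, horizon, dt, seed=0):
    """Simulate n price gaps with drift -mu and an assumed reset-to-zero rule.

    Returns (adjustments per unit of time, mean price change per good).
    """
    rng = random.Random(seed)
    gaps = [0.0] * n
    step_sd = sigma * math.sqrt(dt)
    adjustments = 0
    total_change = 0.0
    for _ in range(round(horizon / dt)):
        # Euler step for p_i(t) = -mu t + sigma W_i(t) between adjustments.
        gaps = [g - mu * dt + step_sd * rng.gauss(0.0, 1.0) for g in gaps]
        if sum(g * g for g in gaps) >= ybar:
            # Assumed rule: close every gap, so the price change is -p_i.
            total_change += sum(-g for g in gaps)
            adjustments += 1
            gaps = [0.0] * n
    frequency = adjustments / horizon
    mean_change = total_change / (n * adjustments) if adjustments else 0.0
    return frequency, mean_change
```

With, say, `simulate_threshold_rule(3, 0.05, 0.1, 0.03, 2000.0, 0.01)`, the product of the two returned numbers should be close to the assumed inflation rate, up to sampling noise of order $\sigma/\sqrt{nT}$. The identity holds for any rule that keeps price gaps bounded, which is why a non-optimal rule suffices for the illustration.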
The result that the "intensive" margin of price adjustment is insensitive to inflation at $\mu = 0$ also applies to the special case of models with only one product, i.e. $n = 1$, as illustrated in the numerical results reported in Figure 3 of Golosov and Lucas (2007), when $\sigma > 0$. The proof of each of these results, as well as of the other parts of this proposition, is based on the symmetry of the problem. For instance, it is easy to see that given the symmetry of the objective function and of the distribution of the Brownian motions $\{W_i(t)\}$, for all $p \in \mathbb{R}^n$ and $\mu \in \mathbb{R}$ we have $V(p, \mu) = V(-p, -\mu)$, and that $p \in I(\mu)$ if and only if $-p \in I(-\mu)$, where $I(\mu)$ is the control set, viewed as a correspondence of inflation. This implies that $N_a(\mu) = N_a(-\mu)$, as well as $h(t, \mu) = h(t, -\mu)$. Thus, if $N_a(\mu)$ is differentiable at $\mu = 0$, then it must be flat there. We skip the proof of the symmetry and of the proposition, since it follows the same lines as the proof of the analogous results in the model with $n = 1$ but with observation and menu costs in Alvarez, Lippi, and Paciello (2011b). The theoretical result about the insensitivity of $N_a$, and the associated linearity of $E[\Delta p]$, is supported by the evidence in Gagnon (2009) who, among others, finds that when inflation is low (say below 10-15%), the frequency of price changes is almost unrelated to inflation, and that the average magnitude of price changes has a tight linear relationship with inflation.
To understand (iii) and (iv) it is useful to realize that for $\mu = 0$ the marginal distribution of price changes is symmetric around zero, a consequence of the symmetry of the loss function and of the distribution of the shocks. Part (iii) shows that all the even centered moments are approximately the same for zero and low inflation. Importantly, this includes the variance and the kurtosis, which is one of the moments that researchers have focused on in the analysis of the effect of multi-product firms. We do expect, however, that inflation has a first order effect on other aspects of the distribution of price changes, such as skewness. Part (iv) shows that the whole distribution of the absolute value of price changes is approximately the same for low and zero inflation. Finally, part (v) shows that inflation has only a second order effect on the expected value function. Equivalently, inflation causes a second order increase in the unconditional expectation of losses for the firm.
These results show that the expected losses of the firm as well as the frequency and several moments of the size distribution of price changes are insensitive to inflation at µ = 0. Thus, the analysis of the problem in a low inflation environment is well approximated by studying the case of zero inflation.
As a benchmark, note that in a flexible price economy all firms change prices, i.e. $\Phi_n(\delta) = 1$, and hence the average price change equals the monetary impulse, i.e. $\Theta_n(\delta) = \delta$ for all $\delta$. Part (i) of the lemma states that, for large shocks, i.e. for $\delta \ge 2\sqrt{\bar y/n}$, all firms adjust and prices change on average by $\delta$, so that the economy behaves like one with no frictions.
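The threshold $\delta \ge 2\sqrt{\bar y/n}$ in part (i) can be understood through the triangle inequality: if $\|p\| \le \sqrt{\bar y}$ then $\|p - \delta 1_n\| \ge \delta\sqrt{n} - \|p\| \ge \sqrt{\bar y}$, so every firm ends up on or outside the adjustment boundary. The Python sketch below checks this by sampling price gaps inside the inaction region. The sampling scheme (uniform direction, uniform radius) is an assumption chosen for simplicity and is not the invariant distribution; only the support matters for this check. The function name is hypothetical.

```python
import math
import random


def all_adjust_after_shock(n, ybar, delta, draws, seed=0):
    """Return True if every sampled price gap satisfies ||p - delta*1_n||^2 >= ybar."""
    rng = random.Random(seed)
    for _ in range(draws):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in g))
        radius = math.sqrt(ybar) * rng.random()  # any radius inside the inaction region
        p = [radius * x / norm for x in g]
        post = sum((pi - delta) ** 2 for pi in p)
        if post < ybar * (1.0 - 1e-12):  # small tolerance for rounding error
            return False
    return True
```

For a shock equal to $2\sqrt{\bar y/n}$ the function returns `True` for any sample, while for a small shock some sampled firms remain inside the inaction region and it returns `False`.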
Part (ii) illustrates a convenient homogeneity property of the Φ n , Θ n functions: after normalizing the monetary shock in terms of the price gap, these functions have only one argument. Part (iii) states that the fraction of adjusters and the response of the aggregate price level are increasing in the size of the shock.