Practical Multi-Key Homomorphic Encryption for More Flexible and Efficient Secure Federated Average Aggregation

In this work, we introduce a lightweight, communication-efficient multi-key approach suitable for the Federated Averaging rule. By combining secret-key RLWE-based HE, additive secret sharing and PRFs, we reduce the communication cost per party by approximately half when compared to the usual public-key instantiations, while keeping practical homomorphic aggregation performance. Additionally, for LWE-based instantiations, our approach reduces the communication cost per party from quadratic to linear in the lattice dimension.


I. INTRODUCTION
As a protocol for training neural networks (NNs) without explicit sharing of learning data, Federated Learning (FL) has received a lot of attention since its inception around 2017 [1]. In a nutshell, starting from an initial common NN model, the FL protocol iteratively builds a global model by having the training data owners (i.e., the clients) locally update the model through the partial execution of a training algorithm, and then letting a central server aggregate these updates to generate the common model for the next round. FL can be instantiated in the cross-device setting, where a model is built from the data of many intermittently available and computationally constrained devices, or cross-silo, in which a model is built from the training sets of a small number of servers that are always available and computationally powerful. This paper focuses primarily on the latter of these two settings.
Federated Learning was initially proposed as a solution for avoiding the prohibitive communication cost of getting training data out of many user devices, as well as for ensuring training data privacy. However, it is now well-known that the baseline FL protocol is not sufficient for guaranteeing the privacy of a client's training data, as the NN parameter updates exchanged throughout the protocol (seen by both the aggregation server and the other clients) leak a lot of information. As a consequence, in recent years, FL has been more deeply investigated with respect to training data privacy.
In this context, performing updates aggregation by means of Homomorphic Encryption (HE) has been investigated from the viewpoint of countering the confidentiality threats from the server on the clients' training data. Yet, from an HE perspective, previous works (e.g. [2], [3]) have focused primarily on performance issues and implicitly assumed overly simple deployment scenarios, e.g. with all the encrypted-domain calculations performed under the same HE keys and all clients sharing the same decryption key in an honest-but-curious setting. In this paper, we introduce a lightweight, communication-efficient multi-key approach suitable for the Federated Averaging rule, allowing each client to use its own key for encryption at each round, and the effective subset of clients that participated in a round to collectively decrypt the aggregated updates to further proceed with the next protocol iteration.

A. Main Contributions
Our proposed aggregation method is secure in the semi-honest setting and works under the Common Reference String (CRS) model. By combining secret-key RLWE-based HE (Ring Learning with Errors) [4], [5] and PRFs (Pseudorandom Function Family), we reduce the communication cost per party by approximately half when compared with its public-key counterpart. This improvement is more significant for LWE-based instantiations (Learning with Errors) [6], [7], in which, thanks to the removal of the mask component of secret-key LWE samples, the communication cost per party is reduced from quadratic to linear in the lattice dimension n.
A high-level comparison among different available aggregation methods and ours is included in Table I for an FL training of N_AggRounds rounds with L participants. In this table, additive secret sharing refers to the execution, in each aggregation round, of the baseline protocol described in Section II-A, which allows the parties to generate additive secret shares of zero to mask the local model updates.
Each entity E_i, i ∈ [1, L], tries to gather as much information as possible, but does not deviate from the protocol P (i.e., E_i will try to recover information about the secrets s_j, j ≠ i, of the other entities). We say that P is secure in the semi-honest model if each E_i has no information other than F(s_1, ..., s_L) at the end of the protocol. Note that assuming semi-honest adversaries in P does not guarantee that no parties will collude [8].
In this work, we provide a solution for secure aggregation in FL with a semi-honest server (and up to L − 1 semi-honest Data Owners if paired with differential privacy techniques).
First, we assume a CRS model, where all Data Owners (DOs) have access to the same PRF. Using PRF_K(T), for the T-th encryption round and with the same secret uniformly random seed K, ensures that all DOs will be able to generate the same common mask a for their distinct RLWE/LWE samples. It is worth mentioning that, as the input T is increased after each call to PRF_K(·), the same "a" value will never be used more than once to encrypt the local model updates.
Second, we assume that DOs will have distinct secret keys. That is, each DO will encrypt her own data m_i with her own secret key s_i (but using the same mask a per encryption round, which was shared with the other DOs). Finally, during the aggregation, the semi-honest server will compute the encrypted sum Σ_i m_i, which decrypts under the aggregated secret key Σ_i s_i.
For a more realistic FL setting, our secure aggregation scheme can be seamlessly coupled with differential privacy techniques, as in [2], to cover threats coming from L − 1 colluding semi-honest DOs (out of L) that aim at gathering information about the remaining DO's data.

II. BUILDING BLOCKS
A. Additive Secret Shares of Zero
Given L Data Owners (DOs), we can generate L uniformly random additive shares whose sum equals zero. The protocol is as follows:
1) The i-th DO (∀i) generates a set of (L − 1) uniformly random elements r_{i,j} for all j ≠ i. Next, the i-th DO computes r_{i,i} = −Σ_{j≠i} r_{i,j}, so that Σ_j r_{i,j} = 0.
2) The i-th DO (∀i) sends r_{i,j} to the j-th party, ∀j.
3) Each j-th DO adds the elements it received, obtaining its share r^(j) = Σ_i r_{i,j}; by construction, Σ_j r^(j) = 0.
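As a quick sanity check, the three steps can be simulated in a few lines of Python (a toy, centralized simulation; the modulus Q and the party count are illustrative choices, not protocol parameters):

```python
import secrets

Q = 2**32   # toy modulus standing in for the scheme's modulus q
L = 5       # number of Data Owners

# Step 1: DO i draws L-1 random elements and fixes r[i][i] so its row sums to 0 mod Q.
r = [[secrets.randbelow(Q) for _ in range(L)] for _ in range(L)]
for i in range(L):
    r[i][i] = -sum(r[i][j] for j in range(L) if j != i) % Q

# Step 2: r[i][j] is sent to party j.
# Step 3: party j adds everything it received, obtaining its share r^(j).
shares = [sum(r[i][j] for i in range(L)) % Q for j in range(L)]

print(sum(shares) % Q)   # → 0: the L shares are additive shares of zero
```

Each individual share is uniformly random, so revealing any proper subset of them leaks nothing about the remaining ones.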

B. Rounding polynomial elements
Let ⌊a⌉_p be the result of scaling by p/q and rounding each coefficient of a ∈ R_q^N to its nearest integer, where R_q denotes the quotient polynomial ring Z_q[x]/(x^n + 1).
This lemma is used in our scheme (see Section III) to remove the error term associated with each encryption. Given (a, b = as + e + q/p · m), we compute ⌊b⌉_p = ⌊as + e⌉_p + m which, by Lemma 1, differs from ⌊as⌉_p + m only with a certain probability Pr(Ev). The upper bound on the probability Pr(Ev) depends inversely on q.
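The effect of the rounding can be illustrated with a scalar (one-coefficient) toy example in Python; the sizes of q and p and the noise range below are illustrative assumptions of this sketch, and the fixed seed only makes the demo reproducible:

```python
import random
random.seed(0)

q, p = 2**40, 2**8        # toy sizes: q >> p, so the rounding absorbs the noise

def scale_round(x, frm, to):
    """Nearest-integer rounding of x*(to/frm), i.e., the map x -> ⌊x⌉_to."""
    return ((x % frm) * to + frm // 2) // frm

s, a = random.randrange(q), random.randrange(q)
m = random.randrange(p)                     # message, one coefficient in Z_p
e = random.randrange(-3, 4)                 # small encryption noise
b = (a * s + e + (q // p) * m) % q          # b = as + e + (q/p)·m

lhs = scale_round(b, q, p) % p              # ⌊as + e⌉_p + m   (mod p)
rhs = (scale_round(a * s, q, p) + m) % p    # ⌊as⌉_p + m       (mod p)
print(lhs == rhs)   # equal, except with probability inversely dependent on q
```

The two sides only disagree when p·as/q lands within p·|e|/q of a rounding boundary, which matches the inverse dependence on q stated above.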

C. Distributed Decryption
Given (a, b = as + e) ∈ R_q^2 s.t. s = Σ_{i=1}^L s_i, where all s_i ∈ R_q, applying modulus switching [10] from q to p we get (⌊a⌉_p, ⌊b⌉_p), with ⌊b⌉_p = ⌊a⌉_p · s + (p/q · a − ⌊a⌉_p) · s + p/q · e.
By applying Lemma 1, the error term e is removed with a certain probability, finally having:
⌊b⌉_p = ⌊a⌉_p · s + e_a · s,    (1)
where e_a = p/q · a − ⌊a⌉_p. From equation (1), we can bound the magnitude of the difference e_distributed = ⌊as⌉_p − Σ_i ⌊as_i⌉_p. This term must be removed for the correctness of the distributed decryption protocol executed after each aggregation round. Assuming that each s_i is bounded by B, and since ∥e_a∥_∞ < 1/2, the magnitude of this remaining error term is bounded by nLB.
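The shape of e_distributed can be observed with a scalar toy simulation in Python (polynomial arithmetic and the exact nLB bound are omitted; the parameters and the looser scalar bound checked below are assumptions of this sketch):

```python
import random
random.seed(1)

q, p, L = 2**40, 2**16, 8   # toy parameters; L parties hold shares of s

def scale_round(x):
    """⌊x⌉_p for x taken mod q (nearest-integer scaling by p/q)."""
    return ((x % q) * p + q // 2) // q

a = random.randrange(q)
s_shares = [random.randrange(q) for _ in range(L)]
s = sum(s_shares) % q

joint = scale_round(a * s) % p                             # ⌊as⌉_p
parts = sum(scale_round(a * si) for si in s_shares) % p    # Σ_i ⌊a·s_i⌉_p

# Centered difference e_distributed, taken in [-p/2, p/2):
e_dist = (joint - parts + p // 2) % p - p // 2
print(abs(e_dist) <= (L + 1) // 2 + 1)   # L+1 roundings, each off by ≤ 1/2
```

Each of the L + 1 roundings contributes an error of at most 1/2, so the scalar difference stays below (L + 1)/2; this is the term the final rounding step of the protocol must absorb.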

III. PROPOSED PROTOCOL FOR SECURE AGGREGATION
Current works making use of Threshold RLWE-based HE [11], [12] define a collaborative key setup phase to generate a joint public key pk associated with several secret keys s_i. This results in a pair (sk = s, pk = (a, as + e)), where each i-th DO holds a share s_i s.t. s = Σ_{i=1}^L s_i. We optimize this primitive for the case of secure federated average aggregation: by assuming the CRS model, ciphertexts can be aggregated on the fly, similarly to true "multi-key" HE schemes. We include next a high-level description of our proposed secure aggregation primitive.

A. High-level description
In the CRS model, each party (a.k.a. Data Owner, DO) has access to a common uniformly random polynomial term a per round. Additionally, we assume that all DOs have run the protocol described in Section II to generate uniformly random polynomial shares; as a consequence, each i-th DO holds share_i = r^(i). Figure 1 gives a high-level description of the required steps for our protocol: (1) DOs encrypt their inputs, (2) the aggregator homomorphically aggregates the encrypted updates, and (3) DOs collaboratively decrypt the aggregated update. In particular, each secure aggregation round is as follows:
1) DOs encrypt their inputs: The i-th DO (∀i) encrypts its model update m_i with its secret key s_i as (a, b_i) = (a, a·(s_i + r^(i)) + e_i + q/p · m_i), which can be compressed by half by sending only b_i, because a is publicly known (i.e., computable with PRF_K(T) for the T-th round).
2) Aggregation step: After receiving all b_i polynomial terms, a semi-honest aggregator can directly compute
b = Σ_{i=1}^L b_i = a·s + Σ_i e_i + q/p · m, with s = Σ_i s_i and m = Σ_i m_i,
since the masking shares cancel out (Σ_i r^(i) = 0). This corresponds to Enc(sk = s, m), the desired encrypted aggregation. Finally, the aggregator sends back share^(agg) = ⌊b⌉_{p′} to the DOs.
3) DOs collaboratively decrypt: each i-th DO computes and broadcasts its decryption share ⌊a · (s_i + r^(i))⌉_{p′}; subtracting all these shares from share^(agg) and applying a final rounding from p′ down to p yields a result which is equal to m with probability higher than 1 − 2^−κ, whenever the encryption parameters are chosen according to Section IV.
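A full round can be simulated in pure Python (with Step 3 collapsed into a direct decryption under the aggregated key for brevity); the parameters n, q, p, L and the key/noise distributions below are illustrative toy choices, far below secure sizes:

```python
import random
random.seed(0)

n, q, p, L = 16, 2**40, 2**12, 4   # toy parameters, far below secure sizes

def polymul(f, g):
    """Schoolbook negacyclic product in R_q = Z_q[x]/(x^n + 1)."""
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            if i + j < n:
                out[i + j] = (out[i + j] + fi * gj) % q
            else:
                out[i + j - n] = (out[i + j - n] - fi * gj) % q
    return out

def scale_round(x):                 # ⌊x⌉_p: nearest integer of (x mod q)·p/q
    return ((x % q) * p + q // 2) // q

rp = lambda bound: [random.randrange(bound) for _ in range(n)]

a = rp(q)                                              # common mask (PRF_K(T) in the protocol)
s = [rp(3) for _ in range(L)]                          # small secret keys s_i
e = [[random.randrange(-2, 3) for _ in range(n)] for _ in range(L)]
m = [rp(p) for _ in range(L)]                          # model updates, coeffs < p
r = [rp(q) for _ in range(L - 1)]                      # zero shares (Sec. II-A), simulated centrally
r.append([(-sum(ri[k] for ri in r)) % q for k in range(n)])

# Step 1: each DO sends only b_i = a·(s_i + r_i) + e_i + (q/p)·m_i.
def encrypt(i):
    key = [(s[i][k] + r[i][k]) % q for k in range(n)]
    ask = polymul(a, key)
    return [(ask[k] + e[i][k] + (q // p) * m[i][k]) % q for k in range(n)]

b = [0] * n
for i in range(L):                                     # Step 2: aggregator sums the b_i
    bi = encrypt(i)
    b = [(b[k] + bi[k]) % q for k in range(n)]

# Step 3 (collapsed): decrypt under the aggregated key Σ_i s_i, valid because Σ_i r_i = 0.
s_agg = [sum(si[k] for si in s) % q for k in range(n)]
ask = polymul(a, s_agg)
dec = [(scale_round(b[k]) - scale_round(ask[k])) % p for k in range(n)]
expected = [sum(mi[k] for mi in m) % p for k in range(n)]
print(dec == expected)   # the aggregate decrypts to Σ_i m_i (mod p)
```

Note how the aggregator never sees any individual s_i or m_i: each b_i is an RLWE sample masked by r^(i), and only their sum becomes decryptable.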

B. Some remarks to manage several ciphertexts per round
For simplicity of exposition, we assume in Section III-A that the model updates m_i coming from the i-th DO fit inside only one secret-key ciphertext (a, b_i). If this is not true, and several ciphertexts are needed to encrypt m_i, then PRF_K(T) can also be used to generate a set of N_Ctxts different a terms for each round, i.e., {a_1, ..., a_{N_Ctxts}}. Finally, the i-th DO would send its N_Ctxts secret-key ciphertexts {(a_1, b_{i,1}), ..., (a_{N_Ctxts}, b_{i,N_Ctxts})} to the aggregator. Here, all the {a_1, ..., a_{N_Ctxts}} terms can be omitted from the transmission, because only the b terms are needed for running the aggregation step.
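Deriving several a terms per round from PRF_K(T) could look as follows; SHAKE-128 is used here only as a stand-in for the PRF (an assumption of this sketch, any secure PRF/XOF would do), and the parameters are illustrative:

```python
import hashlib

n, q = 16, 2**40      # toy ring dimension and modulus
N_CTXTS = 3           # ciphertexts (hence a terms) needed per round

def prf_a(seed: bytes, T: int, n_ctxts: int):
    """Derive the n_ctxts common masks {a_1, ..., a_N_Ctxts} for round T
    from the shared seed K, using SHAKE-128 as a stand-in PRF."""
    out = []
    for idx in range(n_ctxts):
        xof = hashlib.shake_128(seed + T.to_bytes(8, "big") + idx.to_bytes(4, "big"))
        stream = xof.digest(n * 8)              # 8 bytes of output per coefficient
        out.append([int.from_bytes(stream[8 * k:8 * k + 8], "big") % q
                    for k in range(n)])
    return out

K = b"\x00" * 16                   # shared uniformly random seed (toy value)
a_round3 = prf_a(K, 3, N_CTXTS)
# Every DO derives the same masks locally, so none of them is ever transmitted:
assert prf_a(K, 3, N_CTXTS) == a_round3
assert prf_a(K, 4, N_CTXTS) != a_round3   # fresh masks for every round
```

Because the round counter T (and the ciphertext index) is part of the PRF input, no a term is ever reused across rounds, as required in Section I.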

C. Security discussion
In our proposed protocol, DOs make repeated calls to PRF_K(T) as a means to generate all the a polynomial terms needed for encryption. Here, K is a uniformly random seed and the same T is never used more than once during the whole protocol execution. Therefore, by emulating a random oracle with the PRF_K(T) function, all the generated polynomials a are computationally indistinguishable from a set of independent and uniformly random polynomials.
Following this line of reasoning, given a pair of independent and uniformly random terms a, u ← R_q, if an algorithm A can distinguish between (a, ⌊u⌉_{p′}) and (a, ⌊as_i⌉_{p′}), then A can be used to distinguish, with probability 1 − 2^−κ, the RLWE sample (a, as_i + e) from the pair (a, u).
Consequently, the security of our secure aggregation protocol relies on the difficulty of breaking the RLWE indistinguishability assumption. Alternatively, the security of the protocol could also rely on the difficulty of breaking the LWE assumption if ciphertexts are defined under the LWE problem. We refer the reader to Section III-D for more details.

Fig. 1. High-level description of the workflow of our secure aggregation protocol (Secure Federated Aggregation): Step 1, DOs encrypt and send their inputs; Step 2, homomorphic aggregation; Step 3, DOs collaboratively decrypt.

D. From RLWE to M-LWE and LWE
As the a polynomials in the RLWE samples (a, b = as + e) are generated under the CRS model with PRF_K(·), the keys could alternatively be defined under either the M-LWE (Module Learning with Errors) [13], [14] or the LWE assumption, without adding extra communication/computation costs for aggregation. On the one hand, we can work under the LWE assumption with the same communication cost as its RLWE counterpart, hence removing the quadratic communication/computation overhead of public-key LWE-based solutions. On the other hand, there is an overhead for encryption, as well as an increase in the number of calls to PRF_K(·), by a factor n. Table II includes the communication cost per party of the secure aggregation protocol. We assume that the number of model parameters N_ModelParam is high enough.
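The communication gains can be made concrete with a back-of-the-envelope computation; the dimension n, modulus size and model size below are illustrative values chosen for this sketch (the model size matches the FEMNIST example of Section IV-C):

```python
import math

n, log_q = 4096, 40          # illustrative lattice dimension and modulus size (bits)
N_model = 486_654            # number of model parameters (FEMNIST example)

# RLWE: n coefficients packed per ciphertext.
n_ctxts = math.ceil(N_model / n)
rlwe_pk = 2 * n_ctxts * n * log_q    # public-key style: send both (a, b)
rlwe_sk = 1 * n_ctxts * n * log_q    # ours: a comes from PRF_K(T), send only b

# LWE: one sample per model parameter.
lwe_pk = N_model * (n + 1) * log_q   # public-key style: n-vector mask + b
lwe_sk = N_model * 1 * log_q         # ours: only the b component

print(rlwe_pk // rlwe_sk)   # → 2        (the "half the cost" claim)
print(lwe_pk // lwe_sk)     # → 4097, i.e., n + 1 (quadratic -> linear overall)
```

Per party, the RLWE variant halves the traffic, while the LWE variant drops the mask entirely, turning a per-parameter cost of (n + 1)·log q bits into log q bits.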

B. Protocol parameters {p, p′, q, n}
If Ev represents the event of having at least one decryption failure during N_AggRounds consecutive rounds, then applying Lemma 1 gives an upper bound on Pr(Ev). Requiring Pr(Ev) ≤ 2^−κ for a correctness parameter κ yields a first lower bound that q must satisfy. Finally, a last rounding step is applied after aggregating the decryption shares, which must absorb the residual error term growing with L (see Section II-C); this gives the final lower bound for q.

C. Example of parameters for Federated Learning (FL)
Table III includes two different sets of protocol parameters based on the ones provided in [2] for training in an FL context. To fix ideas in terms of performance costs, on the FEMNIST dataset [15], [16], with a 486,654-parameter model and 1000 clients, we obtain (HE-domain) aggregation times of around 27 seconds for an overall time per learning round of around 10 minutes (i.e., including the local training done on the clients), hence a ≈5% overhead. This is in line with other studies [2] that use parameters similar to those in Table III.

D. Session keys
It is easy to define session keys, as the s_i term of s_i + r^(i) can be changed in each aggregation round. Alternatively, other options are possible, e.g., using s_i + u · r^(i), where u is a uniformly random element changed each round and generated by PRF_K(·).

E. Flexible decryption structure
If a DO does not collaborate in decryption, the aggregator and the remaining DOs are able to "fix" their encryptions with an extra communication round, enabling them: (1) to remove the model update of the missing party from the aggregation result, and (2) to decrypt under a different subset of secret keys.

F. General Linear Combination of Model Parameter Updates
For simplicity of exposition, we have only exemplified our protocol in Section III for the case of addition. However, our results for aggregation can easily be generalized to work for any linear combination of the encrypted model updates, i.e., m = Σ_i λ_i m_i. Several possibilities are available for this purpose: either (a) the polynomial terms a are kept fixed, which implies that the DOs must use λ_i s_i instead of s_i during decryption, and that the additive shares generated in Section II-A now have to satisfy the zero equality for the desired linear combination; or (b) each DO uses λ_i^{−1} · a instead of a during encryption, in which case the DOs can keep using s_i for decryption. In both cases, the aggregator computes b = Σ_i λ_i b_i instead of Σ_i b_i.
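Option (a) can be checked with a scalar toy example in Python (the masking shares r^(i) are omitted for brevity, and the parameters and weights below are illustrative assumptions):

```python
import random
random.seed(2)

q, p, L = 2**40, 2**12, 3
lam = [1, 2, 3]                      # public weights λ_i of the linear combination

def scale_round(x):
    """⌊x⌉_p for x taken mod q (nearest-integer scaling by p/q)."""
    return ((x % q) * p + q // 2) // q

a = random.randrange(q)
s = [random.randrange(3) for _ in range(L)]        # small secret keys s_i
e = [random.randrange(-2, 3) for _ in range(L)]    # small noises
m = [random.randrange(p // 8) for _ in range(L)]   # keep Σ λ_i·m_i below p

# Encryption is unchanged: b_i = a·s_i + e_i + (q/p)·m_i.
b_i = [(a * s[i] + e[i] + (q // p) * m[i]) % q for i in range(L)]
b = sum(lam[i] * b_i[i] for i in range(L)) % q     # aggregator: b = Σ λ_i·b_i

key = sum(lam[i] * s[i] for i in range(L)) % q     # decryption uses Σ λ_i·s_i
dec = (scale_round(b) - scale_round(a * key)) % p
print(dec == sum(lam[i] * m[i] for i in range(L)) % p)
```

The weighted sum decrypts correctly because scaling each b_i by the public λ_i scales its key, noise and message terms alike.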

V. CONCLUSIONS AND FUTURE WORK
This work presents a lightweight aggregation protocol for federated learning under the assumption of semi-honest parties, with lower bandwidth requirements than existing protocols and a more flexible setup. In the future, we intend to implement and test this secure aggregation approach when deployed for a practical use case of Federated Learning (such as the one from [3]).
Moreover, we can go beyond the assumption of honest-but-curious data owners and aggregator by extending the protocol with methods for verifiable encryption/aggregation/decryption [17].