ENKI: Access Control for Encrypted Query Processing

A data owner outsourcing the database of a multi user application wants to prevent information leaks caused by outside attackers exploiting software vulnerabilities or by curious personnel. Query processing over encrypted data solves this problem for a single user, but provides only limited functionality in the face of access restrictions for multiple users and keys. ENKI is a system for securely executing queries over sensitive, access restricted data on an outsourced database. It introduces an encryption based access control model and techniques for query execution over encrypted, access restricted data on the database with only a few cases requiring computations on the client. A prototype of ENKI supports all queries seen in three real world use cases and executes queries from TPC-C benchmark with a modest overhead compared to the single user mode.


INTRODUCTION
Outsourcing an application's database backend offers efficient resource management and low maintenance costs, but exposes outsourced data to a service provider. To ensure data confidentiality, data owners have to prevent unauthorized access while the data is stored or processed. Storing data on an untrusted database requires protection measures against curious personnel working for the service provider or outside attackers exploiting software vulnerabilities on the database server. In addition, data owners also have to control data access for their own personnel. An emerging solution to the problem of untrusted databases is encrypted query processing [28,16,8,1,2,6,30,5,25,24] where queries are executed on encrypted data. To grant or restrict shared data access to personnel processing unencrypted query results, data owners have to implement additional fine-grained access control mechanisms. Implementing such a multi user mode using encrypted query processing for a single user operating with one key [28,16,8,1,2,6,5,25] combined with an additional authorization step at the application server like [26] can be compromised: Assume that a user working for the data owner and a service provider's employee collude. If the user knows the decryption key of the data and the employee provides the encrypted data stored in the database, they are able to decrypt all data bypassing the access control mechanisms.
This paper presents the design, implementation, and evaluation of ENKI, a system that securely processes relational operations over encrypted, access restricted relations. Its approach is to encrypt data with different access rights using different encryption keys. Further, it introduces techniques to handle query processing over data encrypted with multiple encryption keys. ENKI builds on previous work in encrypted query processing for a single user as described in [16,8,1,6,25], but ENKI is the first system that efficiently supports queries over data encrypted with different keys. Existing approaches only support query processing with multiple keys for searchable encryption which allows to check if an encrypted value matches a token [30,24,3] or if there is no shared data [25]. The support of query processing over access controlled encrypted data presents two major challenges: The first challenge is the mapping of any complex access control structure required in a multi user scenario to an encryption enforced access control model which still allows query execution. Previous works only focus on the access control mechanism [9] or the key management [4,7]. The second challenge is to efficiently execute a range of queries while minimizing the revealed information on the server and the amount of computations on the client. Current approaches for multiple users offer either limited functionality [30,25,24,3] or expose confidential information to the database [21]. We tackle these challenges using two ideas: First, we introduce a new model for encryption based access control in Section 3 which defines access control restrictions on the level of attribute values and applies encryption as a relational operation to enforce the access restrictions on a relational algebra. Second, we present three different techniques to support the execution of relational operations in multi user mode. 
The first technique is query rewriting, which adapts relational operations to data encrypted with different keys (Subsection 4.1). The second technique is a privacy-preserving proxy re-encryption scheme (Subsection 4.2), and the third is a client-server split (Subsection 4.3).
We have implemented ENKI for a SAP HANA database by extending HANA's JDBC driver and a client application. Our solution supports most relational operations and aggregation functions. The evaluation of different query types seen in three use cases and in the TPC-C benchmark shows that this range is suitable for real world applications. Our performance evaluation shows that ENKI incurs an average overhead of 36.98% (a time penalty of 0.6181 ms on average) when executing queries from the TPC-C benchmark in a two user scenario compared to the single user mode, and that the overhead increases modestly in a more complex scenario.

OVERVIEW
Problem Statement. Consider two users, Alice and Bob, who share access to a database with two tables R and S. Assume that, in each of the tables R and S, Alice has private access to one tuple and shares access to the other tuples with Bob. We encrypt tuples of table R that are only accessible for Alice with key r_a and tuples of table S that are only accessible for Alice with key s_a. Tuples of table R that are accessible for both Alice and Bob are encrypted with key r_ab, and those of table S with key s_ab. Alice knows the keys r_a, r_ab, s_a, and s_ab, and Bob knows the keys r_ab and s_ab. Assume Alice issues an equi-join on tables R and S. To evaluate it, the database executes a cartesian product on all tuples of R and S that Alice is allowed to access and proxy re-encrypts these tuples to check the equality condition. Current proxy re-encryption protocols for deterministic encryption schemes [25], [22] cannot adhere to the access restrictions when applied to the tuples: they reveal private information. To illustrate the problem, consider the proxy re-encryption of keys r_a and r_ab to a new key r_c, denoted as r_a ∼ r_c and r_ab ∼ r_c. Existing protocols are symmetric and transitive, such that a proxy re-encryption r_a ∼ r_c ∼ r_ab exists. Therefore, Bob can proxy re-encrypt all data encrypted with Alice's key r_a to their shared key r_ab. This circumvents the defined access restrictions as the proxy re-encryption reveals information exclusively accessible by Alice.
Architecture. Figure 1 shows ENKI's overall architecture and the involved entities. These entities are the data owner, who also maintains the application, and the service provider, who operates the database. There are also different users (denoted in Figure 1 as User A, User B, User C), who are personnel working for the data owner. A user accesses the database with a client which issues queries via a JDBC driver to the database backend.
We extended the JDBC driver with the ENKI Query Adapter to rewrite an incoming query with minimal effort to be processable in the multi user mode. We also modified the clients to post-process the returned query results. The execution of a rewritten query on a database containing encrypted tuples requires that the predicates of this query must be encrypted, too. Based on access policies, users are acquainted with the necessary encryption keys. These keys are stored encrypted in a key store. If a user logs in, she hands over her masterkey to decrypt her encryption keys stored in the key store. Using the stored and decrypted encryption keys, the ENKI Query Adapter encrypts the rewritten query. The database management system (DBMS) receives the rewritten, encrypted query and executes it on the encrypted database. The encrypted query result is returned to the JDBC driver where it is decrypted by the ENKI Query Adapter before it is post-processed on the client. Note that keys stored in the key store cannot be decrypted if their respective users are logged out. DBMS and database stay unmodified. User-defined functions (UDFs) perform cryptographic operations like our new privacy-preserving proxy re-encryption introduced in Subsection 4.2.
Threat. Our threat model assumes that an attacker has compromised the application and the database server. The attacks are depicted with flashes in Figure 1. We assume that the attacker is passive: she can read all information stored on the database, but does not manipulate the stored data or issued queries. The attacker learns the encryption keys of all users logged in at the time of the attack. Acquainted with their masterkeys and their encryption keys, the attacker is able to read all data of these compromised users stored on the database. In particular, the attacker can access data shared with other, uncompromised users. ENKI offers confidentiality guarantees for non-compromised users during such an attack: the attacker cannot learn their private data, i.e. data which are not shared with compromised users. ENKI provides these security guarantees in the face of a passive attacker. An active attacker who alters or deletes information stored on the database is out of scope for ENKI. We argue that such manipulations might be easier to detect than a passive attack. ENKI also does not prevent attacks on client machines that lead to the compromise of keys.

ENCRYPTION-BASED ACCESS CONTROL ON A RELATIONAL ALGEBRA
This section presents a new model for specifying access rights on attribute values of relations and for enforcing them cryptographically.

Access Restrictions on Relations
We define access restrictions on attribute values of a relation using an access control matrix. Note that an access control matrix may serve as a base for more advanced access models such as role-based access control [27]. Let A be an access control matrix where the rows correspond to subjects s and the columns correspond to objects o. Figure 2 illustrates an access control matrix with two users, Alice and Bob, as subjects and a relation R containing five tuples, t1, . . . , t5, as objects. We denote by S the set of all subjects with |S| = n and by O the set of all objects. A data owner grants access for an object o to a subject s by setting the entry A[s, o] in the access matrix to 1. This enables the user to read, update, or delete the object. If no access is granted, the entry is set to 0. Our approach does not support different types of access rights, e.g. read only or read-write. A column of an access control matrix represents the set of subjects which have access to an object o. We call this the qualified set QS_o of object o. We assume that each object can be accessed by at least one subject such that there are no zero columns and no empty qualified sets. Consider QS_t4 = {1, 1} in Figure 2. This is the qualified set of object t4, denoting that user Alice and user Bob have access to tuple t4.
We further denote by P*(S) the power set of all subjects S without the empty set. We denote each of these subsets as p_i ∈ P*(S) for all i = 1, . . . , 2^n − 1 and call it a user group. From the access control matrix depicted in Figure 2, we derive three user groups: A = {Alice}, B = {Bob}, and AB = {Alice, Bob}. We define a mapping which assigns each user to the user groups she participates in. For each user s, there is a set of user groups p_j with j ∈ {1, . . . , 2^n − 1} that the user participates in. This mapping is called user group mapping and can be stored as a relation with the attributes user and user group. Figure 3 shows the user group mapping for our example.
It has two attributes: User and User Group. It shows that user Alice is a member of user groups A and AB and user Bob is a member of user groups B and AB. A qualified set of an object maps to one and only one user group which contains the same set of subjects. We group all objects accessible by the same user group and call this an object set. It is defined as O(p_i) := {o ∈ O | QS_o = p_i} for a user group p_i ∈ P*(S). This is the set of all objects assigned to the same user group (see Table 2). Note that all nonempty O(p_i) form a partition of O, as any two object sets are disjoint and the union of all object sets is equal to the set of all objects O. We use this partition to divide the underlying relation: each object set is stored in a separate relation which we call virtual relation. A virtual relation indicates that one user group can access all of its tuples. This saves the annotation of a tuple with access information, as its insertion in a virtual relation implies that this tuple can be accessed by a certain user group. For n users, each relation is partitioned into a maximum of 2^n − 1 virtual relations. The total number of tuples does not change as each tuple of a relation is stored in one and only one virtual relation.
We define a mapping which assigns each user group and relation to the virtual relation containing the tuples this user group is granted access to. This mapping is called virtual relation mapping and can be stored as a relation with the attributes User Group, Relation, and Virtual Relation. Figure 3 shows the virtual relation mapping for the user groups A, B, and AB and the relation R. The pair of user group A and relation R is mapped to the virtual relation RA, the pair of user group B and relation R is mapped to the virtual relation RB, and the pair of user group AB and relation R is mapped to the virtual relation RAB. The data owner specifies and maintains the user group mapping and the virtual relation mapping.
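The derivation of user groups, object sets, and the user group mapping from an access control matrix can be sketched as follows. This is a minimal illustration, not ENKI's implementation: the matrix entries are hypothetical (only the fact that t4 is shared by Alice and Bob is taken from the text), and the function names `qualified_set`, `object_sets`, and `user_group_mapping` are assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical access control matrix: rows are subjects, columns are objects.
# Only "t4 is accessible by Alice and Bob" comes from the text; the rest is made up.
ACCESS = {
    "Alice": {"t1": 1, "t2": 0, "t3": 1, "t4": 1, "t5": 0},
    "Bob":   {"t1": 0, "t2": 1, "t3": 1, "t4": 1, "t5": 1},
}


def qualified_set(matrix, obj):
    """Return the set of subjects with access to obj (one matrix column)."""
    return frozenset(s for s, row in matrix.items() if row[obj] == 1)


def object_sets(matrix):
    """Group objects by qualified set; each group is one virtual relation."""
    objects = list(next(iter(matrix.values())).keys())
    sets = defaultdict(list)
    for obj in objects:
        group = qualified_set(matrix, obj)
        if not group:
            # The model assumes no zero columns in the matrix.
            raise ValueError(f"object {obj} has an empty qualified set")
        sets[group].append(obj)
    return dict(sets)


def user_group_mapping(groups):
    """Map each user to the user groups she participates in."""
    mapping = defaultdict(set)
    for group in groups:
        for user in group:
            mapping[user].add(group)
    return dict(mapping)


PARTITION = object_sets(ACCESS)
MAPPING = user_group_mapping(PARTITION.keys())
```

With this hypothetical matrix, the five tuples fall into three object sets, one per user group, and every tuple appears in exactly one of them, which is the partition property used to build virtual relations.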

Encryption as Relational Operation
We now define encryption as a relational operation and show how to enforce the previously defined access restrictions.
The encryption of relation R = R(A1, . . . , An) is the encryption of its attributes A1, . . . , An and their attribute values t1^k, . . . , tn^k for all k = 1, . . . , j as
κz(R) := R(κz(A1), . . . , κz(An)) = {(κz(t1^k), . . . , κz(tn^k)) | ti^k ∈ Ai for all i = 1, . . . , n and for all k = 1, . . . , j}.
We apply adjustable query-based encryption introduced in [25] to efficiently support the execution of relational operations over encrypted data. In the following sections, we only refer to an encryption scheme as relational operation κz with key z, but omit the details of the adjustable encryption and the different encryption keys of each attribute value.
We now use encryption to enforce access restrictions on the attribute values of a relation. Consider a relation R = R(A1, . . . , An) and three user groups A, B, and AB. The data owner splits R into virtual relations RA, RB, and RAB. These virtual relations adopt the same schema as relation R and contain the tuples accessible for the respective user group. The data owner generates an encryption key for each user group and encrypts the respective virtual relation with its key. She generates key r_a for user group A and encrypts RA as κr_a(RA). She also generates keys r_b and r_ab and encrypts RB and RAB for user groups B and AB accordingly. The data owner issues the respective keys to the members of each user group. The number of keys distributed to a user depends on the number of user groups the user participates in. The maximum number of user groups is 2^n − 1, which is the size of the power set of the n users without the empty set. In real world applications, the total number of user groups is much smaller: it is on the order of the number of users n [19].

QUERY PROCESSING OVER AN ENCRYPTED RELATIONAL ALGEBRA
Encrypted query processing over a relational operation can be efficiently supported for single user mode [25]. In particular, this holds for the following primitive and derived relational operations: selection, projection, rename, cartesian product, set union, set difference, equi join, and the aggregate functions group by, count (distinct), sum, average, maximum, minimum, and sort. The introduction of access restrictions on relations interferes with encrypted query processing in three ways: First, a relational operation is now executed on (potentially) multiple virtual relations depending on the access rights of the user rather than on one relation. To tackle this problem, we introduce query rewriting strategies in Subsection 4.1. Applying these rewriting strategies does not change the application logic: we point out that the user only has to submit the original, unchanged query and its user id. The ENKI Query Adapter rewrites the query and adapts it for encrypted query processing. Second, proxy re-encryption of virtual relations is necessary to support count distinct, equi-join, or set difference operations on the server. We present a new privacy-preserving encryption scheme to support proxy re-encryption in a multi user setting in Subsection 4.2. It offers proxy re-encryption of attributes or relations encrypted with different keys while preserving the access rights.
Third, some relational operations can only be executed on the server side with significant computational effort, huge storage capacities, or diminished security. For such cases, we present a client-server split in Subsection 4.3 that requires only little data traffic and minimal computational effort on the client while preserving confidentiality. All introduced techniques are combined in a multi user algorithm, described in Subsection 4.4, to handle the presence of multiple users. The multi user algorithm takes as input a user id and a query consisting of a combination of relational operations, processes it over virtual relations, and returns the result. It can handle an arbitrary set of users. To explain the three techniques, we use the small example introduced in Section 3: we partition table R into virtual relations RA, RB, and RAB and encrypt them as κr_a(RA), κr_b(RB), and κr_ab(RAB) with keys r_a, r_b, r_ab. Table S is treated accordingly.

Rewriting Strategies
We introduce rewriting strategies for the relational operations selection, projection, rename, aggregate function count, set union, and cartesian product over encrypted virtual relations. Applying the rewriting strategies allows the straightforward execution of these relational operations over encrypted data. The rest of this subsection presents the rewriting of these relational operations in detail.
Selection. Consider a comparison operator θ (e.g. =, <, ≤, >, ≥) and let α, β be attributes, constants, or terms of attributes, constants, and data operations. A selection σ_{αθβ}(R) on relation R issued by user Alice is executed on the encrypted virtual relations κr_a(RA) and κr_ab(RAB). The condition αθβ has to be applied on both virtual relations. Therefore, αθβ is encrypted with key r_a as κr_a(α) θ κr_a(β) and with key r_ab as κr_ab(α) θ κr_ab(β).
Projection. Let R′ be a relation with R′A and R′AB as its respective virtual relations. A projection π_β(R) with attribute list β on relation R issued by user Alice is executed on the encrypted virtual relations κr_a(RA) and κr_ab(RAB). Therefore, the attribute list β is encrypted with key r_a as κr_a(β) and with key r_ab as κr_ab(β).
Rename. A rename ρ of an attribute Ai ∈ R to Q issued by Alice is executed on the encrypted virtual relations κr_a(RA) and κr_ab(RAB). The new attribute name Q is encrypted with key r_a as κr_a(Q) and with key r_ab as κr_ab(Q) respectively. It replaces the encrypted original attribute name Ai in the virtual relations. A rename is not persisted.
Count. The aggregate function β γ_{Count(Ai)}(R) on a relation R issued by Alice is executed on the encrypted virtual relations κr_a(RA) and κr_ab(RAB). Here, Count is the aggregate function executed on the server side. It counts the number of attribute values of Ai for the virtual relations RA and RAB separately and adds these partial results on the server.
The output represents the number of attribute values of attribute Ai accessible by Alice.
Set Union. Let relations R and S have the same set of attributes. A set union R ∪ S issued by Alice is executed on the encrypted virtual relations κr_a(RA), κr_ab(RAB), κs_a(SA), and κs_ab(SAB).
Cartesian Product. Let r be a tuple of relation R and s be a tuple of relation S. A cartesian product R × S issued by Alice is executed on the encrypted virtual relations κr_a(RA), κr_ab(RAB), κs_a(SA), and κs_ab(SAB).
ENKI also supports queries to update, delete, or insert a tuple as well as queries to modify the table schema. The introduced rewriting techniques can be directly applied to update and delete a tuple. Schema modification and insertion of a tuple require different rewriting strategies. If a user modifies the schema of one relation, the query must be rewritten to modify the schemas of all its virtual relations. If a user inserts a tuple in a relation, the tuple can only be inserted in one of the virtual relations the user is allowed to access. To rewrite the insert query, exactly one virtual relation must be specified. This choice depends on the access restrictions defined by the data owner.
The computational complexity to rewrite and encrypt a query for the multi user mode depends on the number of operands in the original query (i.e. the number of involved relations) and the number of user groups a user participates in. Assume that the computational complexity is O(s) for unary and binary operations in the single user mode. The computational complexity to rewrite and encrypt a unary operation is linear in the number k of user groups: each query is rewritten and encrypted to be executed on k virtual relations, resulting in a computational complexity of O(s × k). The computational complexity to rewrite and encrypt a binary operation is quadratic, given that a user is granted access to k virtual relations of each relation: each query is rewritten and encrypted to be executed on each of the k^2 pairs of the involved virtual relations, resulting in a computational complexity of O(s × k^2).
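The following sketch illustrates how rewriting fans a query out over virtual relations, and why a unary operation yields k subqueries while a binary one yields k^2. It is a hedged illustration only: the mappings, key names, SQL shape, and the functions `rewrite_selection` and `rewrite_product` are assumptions made here, and `det_tag` is an HMAC-based placeholder for deterministic encryption (keyed and repeatable, but not decryptable), not the adjustable encryption used by ENKI.

```python
import hashlib
import hmac
from itertools import product

# Hypothetical user group mapping and virtual relation mapping.
USER_GROUPS = {"alice": ["A", "AB"], "bob": ["B", "AB"]}
VIRTUAL_RELATIONS = {
    ("A", "R"): "R_A", ("B", "R"): "R_B", ("AB", "R"): "R_AB",
    ("A", "S"): "S_A", ("B", "S"): "S_B", ("AB", "S"): "S_AB",
}
# One (toy) key per virtual relation, standing in for r_a, r_ab, s_a, ...
KEYS = {name: name.lower().encode() for name in VIRTUAL_RELATIONS.values()}


def det_tag(key, value):
    """Placeholder for deterministic encryption of a name or constant."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]


def accessible(user, relation):
    """Virtual relations of `relation` that `user` may access."""
    return [VIRTUAL_RELATIONS[(g, relation)] for g in USER_GROUPS[user]
            if (g, relation) in VIRTUAL_RELATIONS]


def rewrite_selection(user, relation, attribute, constant):
    """Unary case: one subquery per accessible virtual relation (k in total)."""
    queries = []
    for vr in accessible(user, relation):
        key = KEYS[vr]
        column = "c_" + det_tag(key, attribute)   # encrypted attribute name
        value = det_tag(key, constant)            # encrypted constant
        queries.append(f"SELECT * FROM {vr} WHERE {column} = '{value}'")
    return queries


def rewrite_product(user, left, right):
    """Binary case: one subquery per pair of virtual relations (k * k)."""
    pairs = product(accessible(user, left), accessible(user, right))
    return [f"SELECT * FROM {l}, {r}" for l, r in pairs]
```

For Alice, `rewrite_selection("alice", "R", "name", "Smith")` produces two subqueries (over R_A and R_AB) whose predicates are encrypted under different keys, and `rewrite_product("alice", "R", "S")` produces four. How the per-relation results are combined (for example by a union on the server) is left out of this sketch.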

Proxy Re-Encryption as Relational Operation
Virtual relations are encrypted with different encryption keys which prevent comparisons of tuples even if a deterministic encryption scheme is used. However, comparisons are necessary to compute the unary operation count distinct or the binary operations equi-join and set difference over (deterministically) encrypted data. Our goal is to proxy re-encrypt virtual relations on the database server so that all queried attributes share the same encryption key while preserving data confidentiality. To formalize this approach, we define proxy re-encryption as a relational operation.
A symmetric and transitive proxy re-encryption scheme ensures privacy-preserving computations on the database server in the single user mode [25]. However, if you recall the problem statement in Section 2, it does not preserve data confidentiality in the face of multiple users as its application leads to a data compromise. This motivates the introduction of a non-symmetric and non-transitive proxy re-encryption scheme called DETPRE as the cryptographic primitive for count distinct, equi-joins, and set differences in multi user mode.
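The leak caused by symmetric, transitive tokens can be reproduced with a toy scheme. The sketch below is an assumption made for illustration: it uses plain modular exponentiation with tiny, insecure parameters and is neither the scheme of [25] or [22] nor DETPRE. The names `enc`, `token`, `pre`, and `derive_token` are hypothetical. It shows that two tokens toward a common key r_c can be combined into a token from r_a to r_ab.

```python
# Toy parameters: P = 2 * Q + 1 is a safe prime, G generates the order-Q subgroup.
# These values are far too small for any real use.
P = 2039
Q = 1019
G = 4


def enc(m, k):
    """Deterministic toy encryption: G^(m * k) mod P."""
    return pow(G, (m * k) % Q, P)


def token(k_i, k_j):
    """Token that re-encrypts from key k_i to key k_j: k_j / k_i mod Q."""
    return (k_j * pow(k_i, -1, Q)) % Q


def pre(c, t):
    """Proxy re-encryption: raise the ciphertext to the token."""
    return pow(c, t, P)


def derive_token(t_x_to_c, t_y_to_c):
    """Combine tokens x -> c and y -> c into a token x -> y (transitivity)."""
    return (t_x_to_c * pow(t_y_to_c, -1, Q)) % Q


# Hypothetical keys: Alice's private key, the shared key, and a temporary key.
R_A, R_AB, R_C = 123, 456, 789

# The server legitimately receives tokens toward the temporary key r_c ...
T_A_TO_C = token(R_A, R_C)
T_AB_TO_C = token(R_AB, R_C)

# ... and can derive a token from Alice's private key to the shared key.
T_A_TO_AB = derive_token(T_A_TO_C, T_AB_TO_C)
LEAKED = pre(enc(42, R_A), T_A_TO_AB)
```

In this toy setting `LEAKED` equals `enc(42, R_AB)`, so a value that was encrypted only for Alice ends up under the key that Bob also holds. Note that `pow(x, -1, Q)` for modular inverses requires Python 3.8 or later.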
Definition 5. A deterministic proxy re-encryption scheme is a tuple of algorithms (ParamGen, KeyGen, Enc, Token, Pre) such that:
• Parameter Generation. The probabilistic polynomial time algorithm ParamGen takes as input the security parameter λ and outputs system parameters params: params ← ParamGen(1^λ).
• Key Generation. The probabilistic polynomial time algorithm KeyGen takes as input the security parameter λ and outputs a key ki: ki ← KeyGen(1^λ).
• Encryption. The deterministic polynomial time algorithm Enc takes as input a plaintext m and a key ki and outputs a ciphertext: C = Enc(m, ki).
• Token. The deterministic polynomial time algorithm Token takes as input two keys ki, kj and outputs a token to proxy re-encrypt from ki to kj: T = Token(ki, kj).
• Proxy Re-Encryption. The deterministic polynomial time algorithm Pre takes as input a ciphertext C and a token T and outputs a ciphertext C′: C′ = Pre(C, T).
We now present our deterministic proxy re-encryption scheme DETPRE by specifying each algorithm.
ParamGen. Given a security parameter λ, ParamGen works as follows: we generate a prime p, two groups G1, G2 of order p, and a bilinear, non-degenerate, computable map e : G1 × G1 → G2. We choose a generator G ∈ G1 uniformly at random.
KeyGen. We choose ki ∈ Zp uniformly at random.
Enc. We encrypt a plaintext m with a key ki and compute the ciphertext C = G^(m·ki).
Token. We generate a token T that proxy re-encrypts a ciphertext encrypted with a key ki to one encrypted with a key kj.
Pre. We proxy re-encrypt a ciphertext C encrypted with a key ki to a ciphertext C′ encrypted with a key kj.
DETPRE is single-hop, meaning that a ciphertext can only be proxy re-encrypted once. This restricts its usability as the key of a once proxy re-encrypted ciphertext is persisted. Therefore, we propose the following strategy, which retains the benefits of DETPRE while maintaining its re-usability:
1. We encrypt all attribute values using the algorithm Enc. These encrypted attribute values are called base values.
2. If a proxy re-encryption is required, we use the algorithm Pre and proxy re-encrypt the base values with a temporary key c. The proxy re-encrypted results are called DETPRE values.
3. We store the DETPRE values temporarily as a concatenation to the base values and use them to process a relational operation.
4. After the user logs out, the DETPRE values are deleted.
We now describe our adversary model to informally explain the security guarantees our proxy re-encryption scheme provides. We consider a passive adversary, i.e., the adversary can read all encrypted attribute values of all users, but does not modify them. We assume that the adversary has also compromised the application and its database proxy and observes executed operations. In particular, if a user is compromised during the attack, the adversary learns the user's masterkey, the encryption keys stored in the key store, and the tokens used by that user. Our goal is to prevent an adversary from using this information to learn private data of non-compromised users. Therefore, we assume a number of users distributed to n user groups, with each user group endowed with one of the encryption keys d1, . . . , dn, which are kept private. We allow the adversary to compromise all but one encryption key. The adversary could have learned these keys as a result of a collusion between the service provider and personnel working for the data owner. Therefore, she has access to keys {d1, . . . , dn−1} but not to key dn. This implies that the adversary can decrypt all database entries encrypted with the keys d1, . . . , dn−1. In particular, if dn is the private key of a single user, i.e. of a user group with only one member, and this user also participates in additional user groups with compromised members, then the adversary can decrypt all tuples encrypted for these user groups but cannot decrypt the tuples encrypted with dn. The adversary can compute or learn tokens Token(di*, di) that proxy re-encrypt from any compromised key di* ∈ {d1, . . . , dn−1} to an arbitrary key di. Thereby, she can proxy re-encrypt the database entries encrypted with the compromised encryption keys d1, . . . , dn−1.
The database entries encrypted with key dn are not compromised and the adversary cannot access these database entries. She also cannot compute or learn tokens Token(dn, di) which proxy re-encrypt the database entries encrypted with key dn to an arbitrary key di. However, she can compute tokens Token(di, dn) such that a database entry encrypted with an arbitrary key di can be proxy re-encrypted to key dn. Given all this information, the adversary should not be able to proxy re-encrypt an attribute value encrypted with the encryption key dn to another key. We refer to this property as non-reversion. Next, we study a security game to formally define the described security guarantees and prove our claims based on a known hardness assumption. Let A be a probabilistic polynomial time adversary modeled as described above.
Let C be the challenger. Then consider the following security game for a security parameter λ:
Setup. C takes a security parameter λ, runs algorithm ParamGen, and returns the system parameters params to A. C also runs algorithm KeyGen and outputs keys d1, . . . , dn. C sends d1, . . . , dn−1 to A and keeps dn secret. C runs algorithm Token and outputs Token(di, dj) for all i, j = 1, . . . , n, which allow a database entry encrypted with the key di to be proxy re-encrypted to the key dj. C sends Token(di*, di) with i* = 1, . . . , n − 1 and i = 1, . . . , n to A.
Phase 1. A performs actions q1, . . . , qm where each qi is of one of the following types:
Enc: A chooses an arbitrary value s and runs algorithm Enc. Thereby, A encrypts s with key di* for i* = 1, . . . , n − 1. (She knows the keys d1, . . . , dn−1 of all compromised users.) Although she does not know key dn, she can also encrypt an arbitrary value with key dn: given a Token(di*, dn) and a key di*, she can compute the encryption of a chosen value s under the uncompromised key dn as G^(s·dn).
Pre: A runs algorithm Pre to proxy re-encrypt a ciphertext.
Challenge. A chooses a key d and sends it to C. C picks a random value r and encrypts it with key dn as G^(r·dn). It sends G^(r·dn) to A and asks her to proxy re-encrypt C = G^(r·dn) to key d as C_d.
Guess. A outputs her guess C′_d and wins the security game if and only if C_d = C′_d.
The advantage of A in the security game is defined as the probability that A wins, i.e. Pr[C_d = C′_d]. We prove the security of DETPRE in the Appendix. The remainder of this section presents the proxy re-encryption and rewriting strategies to execute the relational operations count distinct, set difference, and equi-join.
Count Distinct. The aggregate function count distinct on a relation R issued by Alice is executed on the encrypted virtual relations κr_a(RA) and κr_ab(RAB). As these virtual relations are encrypted with different keys, it is not possible to apply a count distinct directly. So, we adjust the key of both virtual relations to key c. It is χc(κr_a(RA)) = RA(χc(κr_a(A1)), . . . , χc(κr_a(An))) = RA(κc(A1), . . . , κc(An)) = κc(RA) and χc(κr_ab(RAB)) = κc(RAB) respectively. The aggregate function count distinct is then computed over the proxy re-encrypted virtual relations κc(RA) and κc(RAB).
Set Difference. For a set difference of R and S issued by Alice, the involved virtual relations are proxy re-encrypted to key c, yielding κc(RA), κc(RAB), κc(SA), and κc(SAB) respectively. Then, we apply the set difference on the proxy re-encrypted virtual relations.
Equi-Join. For an equi-join of R and S with condition Ai = Bj issued by Alice, the involved virtual relations are likewise proxy re-encrypted to κc(RA), κc(RAB), κc(SA), and κc(SAB) respectively. We encrypt the condition Ai = Bj as κc(Ai) = κc(Bj) and execute the equi-join on the proxy re-encrypted virtual relations.
The computational complexity to proxy re-encrypt a query for the multi user mode depends on the computational complexity of our proxy re-encryption scheme and on the cardinality j of the involved k virtual relations. DETPRE requires one pairing operation for each of the j × k attribute values. The computational effort increases compared to the operations applied in the single user mode as the pairing operation is more expensive [22,25]. In addition, optimization strategies can minimize the number of attribute values that have to be re-encrypted in the single user mode [17]. These are not feasible in the multi user mode.

Client-Server Split
Aggregate functions count, count distinct, group by, sum, average, maximum, minimum, and sort compute key figures over a whole relation. The encrypted processing of aggregation results is supported on the server in the single user mode [25]. In Subsection 4.1 and Subsection 4.2, we introduced the server-side execution of count and count distinct in the multi user mode. Now, we explain the execution of the rest of these aggregate functions. Introducing virtual relations to specify access restrictions, aggregate functions cannot be executed on the whole relation as this relation is split into different virtual relations encrypted with different keys. In order to evaluate an aggregate function, a user has to process the aggregate function over all virtual relations she is allowed to access. These virtual relations are encrypted with different keys. Typically, the evaluation of aggregation results requires that all invoked virtual relations are encrypted with a shared encryption key. One possible solution is a proxy re-encryption on the server to compute the aggregation results. Such proxy re-encryption schemes must be suitable for the encryption scheme required by the aggregate function. Unfortunately, some can be hard to construct [25] while others require notable computational effort and execution time [13]. Another naive solution processes the aggregate functions on the client. This generates significant data traffic and increases storage capacity as all data has to be transferred to and stored on the client. In addition, the client needs sufficient computational capacity to evaluate the aggregate function. This in mind, we opt for a client-server split where a significant amount of computational effort is executed on encrypted data and small encrypted partial result sets are issued to the client where they are decrypted and further processed to receive the final result. 
Therefore, we split the execution of these aggregate functions between server and client as follows:
• On the server: Computation of the encrypted results for each virtual relation. These are the partial results.
• On the client: Decryption of the partial results and computation of a function $F_{Agg}$ which takes as input the unencrypted partial results and computes the final result depending on the underlying aggregate function.
To illustrate this approach, consider an aggregate function $F(A_i)$ which computes maximum, minimum, average, sum, or sort over an attribute $A_i$. Let $\beta = (A_1, \ldots, A_k)$ be an attribute list to group the results. If $\beta = \emptyset$, then no group-by function is defined. An aggregate function ${}_{\beta}\gamma_{F(A_i)}(R)$ on a relation R issued by Alice is executed on the encrypted virtual relations $\kappa_{r_a}(R_A)$ and $\kappa_{r_{ab}}(R_{AB})$. Therefore, the attribute list $\beta$ is encrypted with key $r_a$ as $\kappa_{r_a}(\beta)$ and with key $r_{ab}$ as $\kappa_{r_{ab}}(\beta)$. The function $F(A_i)$ is also encrypted with key $r_a$ as $F(\kappa_{r_a}(A_i))$ and with key $r_{ab}$ as $F(\kappa_{r_{ab}}(A_i))$. We compute the partial result for virtual relation $R_A$ on the server as

$\kappa_{r_a}(Res(R_A)) = {}_{\kappa_{r_a}(\beta)}\gamma_{F(\kappa_{r_a}(A_i))}(\kappa_{r_a}(R_A))$ (30)

and the partial result for virtual relation $R_{AB}$ as

$\kappa_{r_{ab}}(Res(R_{AB})) = {}_{\kappa_{r_{ab}}(\beta)}\gamma_{F(\kappa_{r_{ab}}(A_i))}(\kappa_{r_{ab}}(R_{AB}))$ (31)

The partial results $\kappa_{r_a}(Res(R_A))$ and $\kappa_{r_{ab}}(Res(R_{AB}))$ are sent to the client where they are decrypted. On the client, we compute the function $F_{Agg}$ which takes as input the unencrypted partial results. It is On the client, we process the partial results $Res(R_A)$ and $Res(R_{AB})$ as follows. If $R_A.A_i = R_{AB}.A_i$, we merge these groups of $R_A$ and $R_{AB}$ and include the merged group in the final result. If $R_A.A_i \neq R_{AB}.A_i$, we carry the partial result over into the final result unchanged. The client-server split does not increase the computational complexity of a query, as this technique only distributes the computations between client and server. However, the additional data traffic increases the communication complexity.
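The client-side merge step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each decrypted partial result is modeled as a dictionary from group key to partial value, the function name `f_agg` is hypothetical, and for average the server is assumed to return a (sum, count) pair per group so that the client can combine them.

```python
def f_agg(partials, kind):
    """Merge decrypted partial results, one dict per virtual relation.

    Each dict maps a group key to a partial value. For kind == "avg" the
    partial value is assumed to be a (sum, count) pair.
    """
    combine = {
        "sum": lambda a, b: a + b,
        "max": max,
        "min": min,
        "avg": lambda a, b: (a[0] + b[0], a[1] + b[1]),
    }[kind]
    merged = {}
    for partial in partials:
        for group, value in partial.items():
            if group in merged:
                # Same group in several virtual relations: merge the values.
                merged[group] = combine(merged[group], value)
            else:
                # Group seen for the first time: carry it over unchanged.
                merged[group] = value
    if kind == "avg":
        return {group: total / count for group, (total, count) in merged.items()}
    return merged

res_ra = {"north": 10, "south": 4}   # decrypted partial result of R_A
res_rab = {"north": 5, "east": 7}    # decrypted partial result of R_AB
total_per_group = f_agg([res_ra, res_rab], "sum")
```

With an empty grouping list, every partial result holds a single group (for example under the key `None`), and the same function reduces to combining one value per virtual relation.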

Multi User Algorithm
We apply the techniques introduced above and present a multi user algorithm that allows a user to execute a query over a set of access restricted relations. It takes as input a user id and an unencrypted query and returns the final result of the query as output. The user id is an identifier unique for each user. A query is a combination of relational operations over one or more relations. The final result is the decrypted result of the query. Consider a relation R with attributes $A_1, \ldots, A_n$ and a relation S with attributes $B_1, \ldots, B_m$. The data owner splits the relation R into virtual relations $R_1, \ldots, R_k$ and encrypts them with keys $v_1, \ldots, v_k$. She also splits the relation S into virtual relations $S_1, \ldots, S_l$ and encrypts them with keys $w_1, \ldots, w_l$ respectively. The data owner handles n users. Each user is equipped with a user id. The data owner defines the user group mapping, where each user id is related to its user groups. She also defines the virtual relation mapping, where each pair of user group and relation is assigned to a virtual relation. Here, we focus on a user who is a member of $i + j$ different user groups. For relation R, the user is a member of user groups which are assigned to the virtual relations $\kappa_{v_1}(R_1), \ldots, \kappa_{v_i}(R_i)$, and for relation S, the user is a member of user groups which are assigned to the virtual relations $\kappa_{w_1}(S_1), \ldots, \kappa_{w_j}(S_j)$. With respect to the specific query, the multi user algorithm requires six steps:  $(R_1), \ldots, \kappa_{v_i}(R_i)$ and $\kappa_{w_1}(S_1), \ldots, \kappa_{w_j}(S_j)$.
We describe the details of this algorithm in Algorithm 1. It takes as input the query Q, which can contain one or more unary or binary operations over relation R (and relation S). It returns a rewritten query sQ to be executed on the server and, in some cases, also a rewritten query cQ to be executed on the client.
Server-side Execution. The server executes the rewritten, encrypted query sQ and returns the encrypted results to the client. If client-side processing is necessary, the server also returns a query cQ.
Client-side Execution. The client receives the encrypted results and decrypts them. If the client does not receive a query cQ, the query processing is finished. If the client receives a query cQ, it executes this query over the decrypted partial results to obtain the final result.
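The lookup and rewriting part of this flow can be sketched for a simple selection. Algorithm 1 itself is not reproduced here, so the code below is only an assumption-laden illustration: the mapping contents, the function names, and the choice of `UNION ALL` to combine the per-virtual-relation subqueries are hypothetical, and the encryption of identifiers and constants is omitted.

```python
# Hypothetical user group mapping: user id -> user groups.
USER_GROUPS = {"alice": ["a", "ab"], "bob": ["b", "ab"]}

# Hypothetical virtual relation mapping: (user group, relation) -> virtual relation.
VIRTUAL_RELATIONS = {
    ("a", "R"): "R_A",
    ("b", "R"): "R_B",
    ("ab", "R"): "R_AB",
}

def accessible_virtual_relations(user_id, relation):
    """Resolve the virtual relations of `relation` that the user may access."""
    return [
        VIRTUAL_RELATIONS[(group, relation)]
        for group in USER_GROUPS.get(user_id, [])
        if (group, relation) in VIRTUAL_RELATIONS
    ]

def rewrite_selection(user_id, relation, columns, condition):
    """Build a server query sQ with one subquery per accessible virtual relation."""
    column_list = ", ".join(columns)
    subqueries = [
        f"SELECT {column_list} FROM {virtual} WHERE {condition}"
        for virtual in accessible_virtual_relations(user_id, relation)
    ]
    return " UNION ALL ".join(subqueries)

s_q = rewrite_selection("alice", "R", ["A1"], "A2 = 5")
```

In a real deployment each subquery would additionally be encrypted with the key of its virtual relation before being sent to the server.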

KEY MANAGEMENT AND DYNAMIC ACCESS CONTROL POLICIES
ENKI enforces access policies through selective encryption, leading to different keys for each user. However, access policies (and thereby keys) might change: a data owner grants access rights to new users or revokes access rights from others. Adding or deleting users of a user group can be formalized as changes in a user hierarchy.
Definition 6. Given the set of users $S = \{s_1, \ldots, s_n\}$, a user hierarchy U is a pair $(P^*(S), \prec)$ where $P^*(S)$ is the powerset of S without the empty set and $\prec$ is a partial order such that for all sets of users $p_i, p_j \in P^*(S)$, $p_i \prec p_j$ if $p_j \subseteq p_i$, for all $i, j \in \{1, \ldots, 2^n - 1\}$.
All user groups $p_i \in P^*(S)$ with a non-empty object set such that are called busy user groups. These are user groups granted access to a set of objects specified by an access policy. These busy user groups might also change when adding or deleting users.
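A small sketch can make the user hierarchy concrete. It enumerates the non-empty subsets of a user set, checks the ordering of Definition 6, and filters busy user groups. The function names and the sample object sets are hypothetical and serve only as an illustration.

```python
from itertools import combinations

def user_groups(users):
    """All non-empty subsets of `users`, i.e. the elements of P*(S)."""
    ordered = sorted(users)
    return [
        frozenset(subset)
        for size in range(1, len(ordered) + 1)
        for subset in combinations(ordered, size)
    ]

def precedes(p_i, p_j):
    """p_i precedes p_j in the hierarchy if p_j is a subset of p_i."""
    return p_j <= p_i

def busy_groups(object_sets):
    """User groups whose assigned object set is non-empty."""
    return {group for group, objects in object_sets.items() if objects}

groups = user_groups({"s1", "s2", "s3"})

# Hypothetical access policy: only two groups have objects assigned.
policy = {
    frozenset({"s1"}): {"o1"},
    frozenset({"s1", "s2"}): {"o2", "o3"},
    frozenset({"s2", "s3"}): set(),
}
busy = busy_groups(policy)
```

For three users the hierarchy contains 2^3 − 1 = 7 user groups, of which only the groups with assigned objects need keys and virtual relations.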
The data owner downloads the object set $O(p_i^{orig} \cup s_{n+1})$ and re-encrypts it with a new key. We differentiate three scenarios in which access rights are revoked from a user. First, all access rights of a user are revoked. Second, a user is removed from a user group. Third, a user loses access to certain objects of a user group. Consider the first scenario where all access rights of a user are revoked. The original user hierarchy changes as the set of users S is reduced by one element $s_n$. This is to reduce The data owner updates the user group and virtual relation mapping according to the changes of user hierarchy and busy user groups to keep track of the changing users, user groups, and virtual relations. She also distributes the encryption keys to the respective users, updating their key stores when they are logged in. In most cases, changing keys implies that the data has to be downloaded and re-encrypted in a trusted environment. In particular, each onion layer of the adjustable encryption has to be removed and re-encrypted using a new key. This is time-consuming and increases the data traffic. A proxy re-encryption on the server would save this overhead, but it has to prevent the untrusted service provider from learning the new encryption keys, computing arbitrary proxy re-encryptions, or gaining information about the onion layers. Currently no such solution exists, particularly for order-preserving encryption.

EXPERIMENTAL EVALUATION
We implemented ENKI as an extension of an existing single user solution to support the multi user setting. We use a modified JDBC driver for the single user mode which receives unencrypted SQL queries, modifies their operator tree, performs the onion selection, and encrypts the results [15]. As described in Figure 1, ENKI is an additional modification of the JDBC driver that performs query rewriting for the multi user mode and provides a client add-on to execute the post-processing. Our experimental setup consists of a server and a client. The server is a HANA database server with 252 GB RAM and an 8-core 2.6 GHz processor. It hosts an unmodified SAP HANA database. The client has 16 GB RAM and a 2-core 2.8 GHz processor. It hosts a modified JDBC proxy and an ENKI Query Adapter as well as an SQLite database. The queries are executed on the unmodified SAP HANA database [10], where UDFs execute the cryptographic operations. We implemented DETPRE in C using the PBC and GMP libraries, which provide the mathematical operations underlying pairing based encryption [20,14]. We evaluate functionality and performance of ENKI on the TPC-C benchmark and three real world use cases described in Subsection 6.1. In Subsection 6.2, we analyze which types of queries and access policies can be supported. In Subsection 6.3, we evaluate the performance overhead caused by the necessary modifications of ENKI.

Use Cases
IS-H. IS-H is the healthcare management solution of SAP for patient management. In our observed query trace, we see 7 tables with 477 columns in total. As all tables contain personal information, we assume that all tables must be treated as confidential. The users accessing this application are typically associated with different roles, which are sets of organizational units. Patient information is associated with the organizational units of the patient's encounters. To protect sensitive patient information, access policies prevent users from accessing medical details of patients if they are not associated with the set of organizational units of the patient.
LSM. LSM is an internal SAP solution which supports facility management in planning resources. Peers on a certain SAP management level enter confidential planning information for their area. The peers are only allowed to access the data they committed themselves but not the data of other peers. Facility management has access to all data and calculates figures for future resource planning which are sensitive. In our evaluation, we use the access policy specified for the facility management such that a user participates in n user groups given n users. The application consists of 25 tables and 173 columns.
TPC-C. TPC-C is an OLTP benchmark consisting of 9 tables and 92 columns. We assume that all tables and columns are sensitive and define an access policy for a two user scenario where each user has certain private data and other data is shared.
SFIN. Simplified Financials (SFIN) is part of the SAP ERP application relying on SAP HANA as a database backend. In our use case, this application analyzes consumers' data sets consisting of 9 tables and 741 columns. We assume that all tables and columns are sensitive and define an access policy for a two user scenario where each user has certain private data and other data is shared.

Functional Evaluation
We analyzed the applications described in Subsection 6.1 to evaluate which queries and access policies ENKI can support.
Queries. Table 1 shows the issued query types for each application. ENKI supports all observed queries including equality and order selections, equi-joins, aggregations, and combinations of these. In addition, ENKI also supports update, insert, and delete statements. Therefore, ENKI provides enhanced functionality compared to existing solutions [30,25,24]. ENKI cannot support the execution of range joins on the database server if a range join includes columns of different virtual relations encrypted with different keys. To our knowledge, there is no proxy re-encryption scheme for OPE available. Therefore, ENKI would execute such range joins only on the client side. However, we consider this acceptable as we did not observe a range join in any of our four applications.
Access Policies. ENKI supports the access policies specified for the IS-H and LSM applications, as its tuple-wise access restrictions on tables match well with the described requirements. This tuple-wise access enables the implementation of most of the access policies specified by authorization views [26]. The exception is policies which only allow aggregated views on columns, e.g. a user is only allowed to see the average of all attribute values of a column but not the unaggregated attribute values.

Performance Evaluation
We investigate two questions in order to evaluate the performance of ENKI:
• What is the performance penalty of our algorithm for the multi user mode compared to the single user mode?
• What is the performance impact of our proxy re-encryption scheme?
In the experiments to answer the first question we assume that the proxy re-encryption has already taken place. Compared to the single user mode where an encrypted query is executed on encrypted data, our algorithm rewrites, executes, and post-processes an encrypted query for the multi user mode. We analyze the time consumed by query rewriting, query execution, and post-processing to better understand the performance penalty of the multi user mode. Figure 5 shows the time consumption to rewrite unary, binary, and ternary relational operations given the LSM access policy such that the number of involved virtual relations accessed by one user increases linearly with the number of additional users n. Figure 5 illustrates that the effort is $O(n)$ for unary operations, $O(n^2)$ for binary operations, and $O(n^3)$ for ternary operations.
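The reported growth rates are consistent with a rewriter that emits one subquery per combination of virtual relations across the operands of an operation. The sketch below counts these combinations. That the rewriter works exactly this way is an assumption, and the function and relation names are hypothetical.

```python
from itertools import product

def subquery_combinations(virtual_relations_per_operand):
    """One tuple per combination of virtual relations, one entry per operand."""
    return list(product(*virtual_relations_per_operand))

n = 4  # number of virtual relations a user can access per relation
r = [f"R_{i}" for i in range(n)]
s = [f"S_{i}" for i in range(n)]
t = [f"T_{i}" for i in range(n)]

unary = subquery_combinations([r])          # n combinations
binary = subquery_combinations([r, s])      # n^2 combinations
ternary = subquery_combinations([r, s, t])  # n^3 combinations
```

Under this reading, every additional user group adds one subquery to a unary operation but n subqueries to a binary one, which matches the linear, quadratic, and cubic trends described for Figure 5.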
Changing the access policy might increase the number of virtual relations assigned to a user such that she participates in more than n user groups. However, we did not observe such a worst case access policy in any of our use cases. In addition, we studied the literature, but did not find any case referring to this requirement [19]. In accordance with current literature, we even assume that the number of actual user groups is smaller than the number of users [19]. This implies that the evaluation of the LSM access policy with a linearly increasing number of user groups per user represents an upper bound on the time consumption to rewrite unary, binary, and ternary relational operations. We further analyze the execution time for a mix of relational operations given the LSM access policy where the number of user groups increases linearly with the number of users. Figure 6 presents the execution time for an increasing number of user groups $n = 50, \ldots, 400$. In the multi user mode, each additional user adds one more user group. Hence the query expands by an additional subquery for each additional virtual relation. This effort is reflected by the execution time, which ranges from 0.196 s for 50 user groups to 1.5 s for 400 user groups. The query execution time in single user mode is nearly constant as the same query set is executed on a growing amount of data. Figure 7 shows the effort to post-process unary relational operations including aggregation functions over an increasing number of user groups $n = 50, \ldots, 400$. These numbers contain the computational time for the client-server split as well as the necessary merge of the result sets on the client. Although the client-server split does not increase the computational complexity, we observe a post-processing overhead which grows moderately with the number of user groups.
On the client, the computations of maximum, minimum, and sum require $n - 1$ operations and the computation of average requires $2n - 1$ operations. The time to compute sort depends on the number of virtual relations n, but also on the maximum cardinality of all involved virtual relations. It is $O(m \log n)$ to merge n sorted lists with a total of m attribute values. The time to post-process the group by operation is also $O(m \log n)$, a merge of m groups of n virtual relations. In addition, it is $O(m - 1)$ to aggregate the partial results if similar groups exist.
Figure 4 shows the time consumption to rewrite, execute, and post-process a mix of 20 select queries seen in the TPC-C benchmark and compares them to their execution time in the single user mode. Table 1 shows the query types of TPC-C. For the single user mode we execute the encrypted queries according to [25], i.e. without any access policy. For the multi user mode we use the same access policy as in the examples in this paper: there are two users and three user groups. Each user has access to his private data and both users have access to shared data. In order to execute a multi user mode query, we need to rewrite, execute, and post-process. We measure the time of these steps and compare their total to the execution time of the single user mode. Figure 4 presents the results for the 20 TPC-C queries. The multi user mode incurs an average overhead  Figure 4 illustrates that Query 18 (including a sum operator) and Query 20 (including a range condition) both consume a significantly larger amount of execution time than all other unary and binary relational operations, in the single as well as in the multi user mode. This is caused by our implementation of the respective encryption schemes, whose performance could be further optimized [25].
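The $O(m \log n)$ bound for sort corresponds to a standard k-way merge of already sorted lists. A minimal sketch using Python's `heapq.merge` is given below. The function name and the sample data are hypothetical, and each partial result is assumed to arrive sorted from the server and to be already decrypted.

```python
import heapq

def merge_sorted_partials(partials):
    """Merge n sorted, decrypted partial results into one sorted list.

    heapq.merge keeps a heap with one entry per input list, so merging
    m values from n lists takes O(m log n) comparisons.
    """
    return list(heapq.merge(*partials))

# Hypothetical sorted partial results from three virtual relations.
sorted_ra = [1, 4, 9]
sorted_rab = [2, 3]
sorted_rac = [5]
final_result = merge_sorted_partials([sorted_ra, sorted_rab, sorted_rac])
```

Because the server already sorts each virtual relation, the client never has to sort from scratch. It only interleaves the lists.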
For the second performance question we conduct another experiment. We measure the execution time of our new encryption scheme DETPRE, used to process count distinct, set difference, and join securely over data encrypted with different keys in the multi user mode, and compare it to the encryption scheme Join-Adj used in the single user mode [25].
We present a micro benchmark in Table 2 which contains the time to compute the three algorithms of the scheme: encryption, token computation, and proxy re-encryption. The time to encrypt data is almost equal in both schemes, with DETPRE consuming 1.5860 ms and Join-Adj consuming 1.6058 ms. The computation of the token consumes 0.03311 ms in DETPRE compared to 0.0014 ms in Join-Adj. The proxy re-encryption consumes 1.0191 ms in DETPRE and 0.0003 ms in Join-Adj. This proxy re-encryption time multiplies with the cardinalities of all columns which have to be proxy re-encrypted. It is possible to perform some computations in advance, i.e. when the user logs in, saving time during the execution. However, it is not possible to substitute DETPRE with Join-Adj in the multi user case, as a privacy-preserving proxy re-encryption in the multi user mode must be non-symmetric and non-transitive.
We see roughly a 40% increase on average per user group in multi user mode for query rewriting, query execution, and post-processing. We know from the literature that the total number of user groups scales linearly with the total number of users. In our experiments the absolute increase per user group was on average roughly 0.6 ms. If we assume that a user is not willing to wait longer than, say, 1 s, we can accommodate 1500 user groups per query. This is sufficient for many practical examples such as those in our experiments. The total execution time in multi user mode consists of 4% for query rewriting, 82% for query execution, and 14% for post-processing. As query execution accounts for the largest share, we will focus on its optimization in future work. If a query requires proxy re-encryption, our experiments show an absolute increase of roughly 1 ms per proxy re-encryption of one item. This number needs to be multiplied by the number of non-null rows, i.e. 100 s for 100,000 items and 16.5 min for 1,000,000 items.
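The back-of-the-envelope estimates above can be reproduced with a few lines of arithmetic. The sketch below uses the rounded per-group and per-item costs quoted in the text. The function names and the 1 s waiting budget as a parameter are illustrative assumptions.

```python
PER_GROUP_MS = 0.6       # reported average increase per user group
PRE_MS_PER_ITEM = 1.0    # reported rough cost of one proxy re-encryption
WAIT_BUDGET_MS = 1000.0  # assumed tolerated waiting time of 1 s

def max_user_groups(budget_ms=WAIT_BUDGET_MS, per_group_ms=PER_GROUP_MS):
    """Largest number of user groups whose added cost fits into the budget."""
    return int(budget_ms / per_group_ms)

def reencryption_seconds(items, ms_per_item=PRE_MS_PER_ITEM):
    """Total proxy re-encryption time in seconds for `items` non-null values."""
    return items * ms_per_item / 1000.0

groups_bound = max_user_groups()
small_table_s = reencryption_seconds(100_000)
large_table_min = reencryption_seconds(1_000_000) / 60.0
```

With these rounded inputs the bound comes out at 1666 user groups, so the figure of 1500 quoted in the text sits below it. Re-encrypting 100,000 items takes 100 s, and 1,000,000 items take about 16.7 min, close to the roughly 16.5 min stated.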
We propose to perform the proxy re-encryption in advance saving time during the query execution. While the user is logged in, these DETPRE values can be easily cached, but will not necessarily be persisted after the user logs out. Another option is to pre-compute the DETPRE values based on an expected set of queries.
In conclusion, our system scales well to a realistic number of user groups, but for large databases the proxy re-encrypted values should be persisted.

RELATED WORK
Queries over encrypted data. Encryption schemes supporting certain relational operations include keyword search [28] or range queries [2,5,23,18]. Some works provide data confidentiality using tuple-wise encryption and execute queries using indexes organized in buckets [16,8]. Ciriani et al. satisfy privacy constraints by (partial) encryption and fragmentation of data and rely on the application logic to process a query [6]. Popa et al. introduce adjustable query-based symmetric encryption to process queries on the server side [25]. Tu et al. propose an extension of this system to execute complex queries, e.g. nested subqueries as seen in the TPC-C benchmark, by partitioning the query execution between server and client. We see no obstacle to combining this technique with the client-server split introduced by ENKI. Query processing with multiple keys without sharing data is presented in [25] and using searchable encryption in [30,24,3]. Ferretti et al. introduce a proxy concept to handle multiple users in the CryptDB setting, but do not present an experimental evaluation or a security analysis to prove their claims [11].
Joins over encrypted data. Deterministic encryption schemes that offer symmetric and transitive proxy re-encryption are presented in [22,25]. Hacigumus et al. require extensive query rewriting to compute joins [16]. Agrawal et al. propose an interactive approach [1]. Furukawa et al. provide a non-transitive and non-symmetric approach to compute a join such that a probabilistic encryption is degraded to be deterministic in the single user mode [12]. The encryption scheme presented in [3] also handles join operations, but no confidentiality guarantees are provided.
Access Control. A system for encryption enforced access control for outsourced data is proposed in [9], but this solution does not support query execution on encrypted data. Rizvi et al. introduce authorization views that enable the specification of access policies using SQL queries on the application level [26]. This restricts the access of users but does not prevent a service provider or an intruder from learning the data stored on the database.
Key Management. The key management strategies introduced in [4,7] can be combined with our access control model. ENKI can benefit from these strategies through a reduced number of keys a user has to store.

CONCLUSION
This paper presented ENKI, a system for securely executing relational operations on encrypted, access restricted data. ENKI introduces an encryption based access control model to enforce access restrictions on encrypted data using different encryption keys. ENKI uses query rewriting and post-processing to process relational operations over data encrypted with different encryption keys. It applies a newly introduced encryption scheme to execute the relational operations count distinct, set difference, and join while protecting data confidentiality. Our evaluation shows that its performance depends on the specified access policies and on the type of relational operation. It achieves modest overhead for the select queries of the TPC-C benchmark and the LSM use case.

PROOF. Assuming that an adversary can solve the described security game correctly, we construct a polynomial time algorithm which can solve the underlying problem of the l-Bilinear Diffie-Hellman Inversion (l-BDHI) assumption. This algorithm receives an instance of the l-BDHI problem with $G^a, G^{a^2}, \ldots, G^{a^l} \in G_1$ and has to compute $e(G, G)^{1/a} = g^{1/a} \in G_2$.
Setup. Receive an instance of the l-BDHI problem as $p, e, G_1, G_2, G, g, G^a, G^{a^2}, \ldots, G^{a^l}$. Choose $d_i \in Z_p$ uniformly at random. Run algorithm Token to compute $Token(d_i, d_j)$ with $i, j = 1, \ldots, n$. Send system parameters $p, G_1, G_2, e, G, g$, encryption keys $d_1, \ldots, d_{n-1}$, and tokens $Token(d_{i^*}, d_i)$ with $i^* = 1, \ldots, n-1$ and $i = 1, \ldots, n$ to A.
Phase 1. A performs the following actions:
Enc: A runs algorithm Enc to encrypt arbitrary messages m with keys $d_1, \ldots, d_{n-1}$. To encrypt message m with encryption key $d_n$, which is not known to A, the adversary exploits its knowledge of encryption keys $d_1, \ldots, d_{n-1}$ and tokens $Token(d_{i^*}, d_n)$ to compute

REFERENCES
Using this result, A computes $G^{m d_n}$.
Pre: A runs algorithm Pre to proxy re-encrypt a ciphertext C encrypted with key $d_{i^*}$ with $i^* = 1, \ldots, n-1$ to be encrypted with key $d_i$ with $i = 1, \ldots, n$.
Challenge. A chooses a key $d \notin \{d_1, \ldots, d_n\}$ and sends it to C. C picks a valid ciphertext as $C = Enc(m, k) = G^{ma} = G^r$ (50) and sends $C = G^r$ to A. C asks A to guess the proxy re-encryption of C to key d as
Phase 2. A performs further actions as described above.
Guess. A returns its guess for V as $V'$ to C. C computes to solve the instance of the l-BDHI problem as The probability that this algorithm solves the l-BDHI problem is the same as the advantage of the adversary in the security game. It is $\Pr[V = V'] = \epsilon$. If the l-BDHI assumption holds, this advantage can only be negligible. Therefore, the adversary can only achieve this attack with a negligible advantage.