Secure Cloud based Privacy Preserving DataMinning Platform

,

To address, the Digital signature scheme is adopted by various methods. The Digital signature with public-key cryptography is first invented by Diffie and Hellman [14] and was first adopted by Rivest, Shamir and Adleman [15]. Digital signatures consist of following setup such as Key Generation process which produces secret/master key and public key, Signing process which generates a signature by using secret/master key for a data and Verification process checks whether some signature on data using public key. This model is not applicable in practice since the signature is valid if one-way function is exist [15] [16] i.e. the signer key need to be changed during the execution process of signing process. Therefore a random signature generation scheme is required but it is not suitable for plain text scheme [17] as a result they are not provably secured. In [18,19] presented first signature model that support practical implementation with strong RSA assumption. The model is not evaluated under Ideal randomness assumption and designed for specific standalone application. Therefore, the key is to develop an anonymous access ticket (credential) signature generation model.
In anonymous access ticket model the Requestor access the resources by presenting an access ticket that he is authenticated. The real example of this scenario is the Passport, Driver license etc. as access ticket which permits the Requestor to prove ownership/ citizenship that authorize the Requestor to vote, drive vehicle etc. In digital environment the Requestor identity is his master key and the access ticket as a signature on this Master key. An access ticket model is said to be anonymous if it allows Requestor to get access ticket without revealing more information of its identity. By adopting zero-knowledge proofs we can achieve this but it induces additional computation which is expensive. Since it requires the Boolean circuitry to prove the statement is true. Many existing model adopted two party computation models to allow Requestor to be anonymous. This also induces computation overhead due to Boolean circuitry gat by gate verification. To address these in this work we consider that an anonymous access ticket model can allow for a setting in which Requestor can have varied scenario and varied pseudonyms. Most of the existing access ticket signature model has adopted signature generation based on public key cryptography model which induces high computation overhead. Especially when it is deployed on cloud platform it increases cost.
Here we present a key based access key verification model which allow the Ticket Provider (TP), and Ticket Checker (TC) to share admittance to the some master keys, then Elliptical Curve Cryptography (ECC) is adopted to validate that the TP and TC belong to similar group. As long as the Requestor does not required to be authenticated by other Requestor the TP can share distinct master key with all the TC and then at same time it gives access tickets to all TC. In a non-interaction or a standalone model the TP must give access ticket to all TC at same time which is not suitable when large number of TC exist in a system. We consider an online cloud based model where the Requestor wishes to get access ticket from the TP whenever he is desired. The blind issuance adopted in our model allows the Requestor to get to access ticket on same attributes without revealing it to the TP, i.e. in an anonymous way. Our model allows to translation of publically access ticket verification into an efficient keyed based access ticket verifiable with same function and attributes. The key challenges of secure resource provisioning in public cloud environment [20,21] such as availability, data confidentiality, security, communication bottleneck, inside attack, outside attack, privacy consideration for processing of sensitive data [22,23], [34].

Motivation and Contribution
The adoption of cloud computing environment by various applications has raised numerous security issues such as Denial of Service (DoS), malicious insiders and data loss which have been broadly analyzed in [20,21]. These issues exist mainly due to loss of control over data, multi-tenancy and trust. The major public cloud service provider such as Microsoft Azure, Goggle clouds and Amazon simple storage services namely Amazon S3 do not assure in service level agreement specific to security and privacy as part of contractual agreements condition among cloud consumer and cloud service provider. Therefore it is critical and necessary concern in providing security and privacy when cloud platform is adopted. Therefore when designing a secure resource sharing model using cloud platform the security and privacy concern of all parties involved in the model must be taken care off.
The model presented here achieves anonymous access ticket sharing among different parties for secure access of resources stored public cloud platform. The proposed model provides party based security provisioning with reduced computation overhead by adopting key based encryption technique.

Related Work
With the rapid growth of online based service delivery such as social media, healthcare and so on. These platforms adopt cloud technology which store data of user in cloud storage for various purposes. Therefore providing security and preserving privacy of data is most desired. There have been several methodologies that have been proposed in recent time in order to improve the security and privacy of data that are stored in remotely or on public cloud environment which are surveyed below.
In [25] considered the issue pertaining to secure sharing of huge multimedia data and its privacy. The main concern of their approach is that how to preserve privacy for such huge volume and variety of data and how to enforce privacy policies to provide secure and efficient resource management. The research survey conducted by them shows that the privacy policy enforcement should be done in access control model. They presented a hybrid model namely Intelligent Privacy Manager (iPM) where the requirement of privacy is implemented in access control mechanism and presented an enforcement and composition of privacy policies framework. They considered permitting only the Requestor to provide a conflict free privacy rule and not security manage or architecture. They also considered their model to be context aware and proved privacy policy correctness. To ensure providing security of sensitive data and enforce privacy policies a verification step is performed. The implantation model did not consider the cloud platform and there is no simulation proof to show their model reduces computation cost.
In [26] they considered a privacy preserving model for encrypted data that are stored in cloud environment by the user. Since the data stored on cloud platform is of encrypted form there is a need of secure and efficient search mechanism of cipher text. The area of concerned here is how to categorize document during encryption which is a crucial factor for retrieval result accuracy. The increase in storage of huge variety and volume of data on cloud storage environment brings added challenges in providing efficient online large encrypted resource retrieval model. Here they presented an hierarchical clustering based semantic information retrieval system of large encrypted data. Their model used minimum relevance threshold to cluster the document. Their model is solving linear computation complexities issues and to authenticate search result privacy they adopted hash sub-tree. They conducted experimental study on IEEE Xplore built datasets. The outcome of their result show better rank privacy then its counter parts. They did not considered query encryption as a result the prone to chosen cipher attack.
In [27] presented a architecture namely CypherDB to protect the data confidentiality that are store in external cloud storage environment. Their model enables an efficient and low computation model to retrieve resource from cloud database storage environment. The entire data are encrypted and are stored in cloud database which can be retrieved securely by using their CypherDB secure processor architecture. To avoid data leakage and optimizing computation efficiency their architecture data paths is tightly coupled during query execution and database access. Simulation is conducted on TPC-H database and compared with wellknown SQLite database management architecture. An average reduction of 10% is achieved considering without security. They adopted symmetric approach using 128-bit security keys which does not incur much overhead at the same time security is compromised as compared with other public-key cryptography. Any changes in architecture is difficult to achieve as a result their model is not robust.
In [28] presented a framework to address the issue pertaining to data sharing on cloud based big data environment. There is been increased personal data storage in big data environment. Here they presented secure data sharing, storage, processing and destruction on cloud hosted virtual machine. They adopted proxy re-encryption technique for cipher generation by using AES symmetric algorithm on a semi trusted server. Their model preserves the privacy of data owner in protecting sensitive information. It also retains complete control of data owner data. The drawback with their model proxy key generation is done on semi trusted server which is protected by TPM key. As a result any party misbehaving will result in loss of privacy and SLA (Service level agreement). In [29] identified the computation complexity issue pertaining in designing and comparing cryptographic mechanism to preserve privacy of data. Here they compared some of existing approach such garbled circuits, secret sharing and homomorphic approach. They evaluated the performance efficiency of each model in order to evaluate the efficiency claim by existing approaches. To evaluate a fair comparison among these models the following performance metric is considered. The parameters are bandwidth requirement, run-time and complexity. Here they considered well known two party mechanism on semi trusted environment. The performance of these metric are evaluated in following stages, initialization, online computation and pre-processing considering identical platform. The overall outcome of their model shows pros and cons of different model.
The overall survey shows that preserving privacy of data in cloud platform is most desired. The existing lacks in providing efficient privacy preserving of data owner due to computation complexity. Therefore there is a necessity to develop a zero knowledge proof based access ticket mechanism for public cloud environment and reducing the cloud expense/cost. The paper organization is as follows: The Related work is presented in section two. The research background work is presented in section three. In section four the problem definition is identified. The proposed models are presented in Section five. The simulation results and the experimental study are presented in the section six. The concluding remark is discussed in the last section.

Research Method
In [30] presented multi-authority CP-ABE model for secure data access in cloud environment. Their model adopts is presented in random oracle model which is considered to be CPA-secure considering constant cipher text under q-Bidirectional Diffe-Hellman encryption (q-BDHE) assumption. To reduce the computation complexity of data owner the computation is outsources to cloud environment. Similarly in [31] presented CP-ABE access control mechanism by using homomorphic encryption for public cloud environment for secure sharing of multimedia data. Both these protocol achieves significant performance improvement over existing approaches. The main drawback of their strategy is they induce high cloud expenses since the proxy key generation is computed on cloud environment. When the number of group is increased the cloud computation cost also increases linearly.
Given a Cloud based model with N set of group/Requestors, the objectives are given as follows: 1. To develop a secure public cloud platform to store mining data using security services 2. To develop a key based access ticket verification model. 3. To evaluate the performance of proposed model interm of cloud computation cost, processing time 4. To evaluate performance by varying the attributes size. 5. To evaluate performance by varying the group/requestor size.

System Model
The proposed secure public cloud platform consists of following parties, Requestor, Knowledge Base Provider (KBP) and Privacy Preserving Data Mining (PPDM) platforms. The KBP consist of following parties Ticket Provider which act as a third party between Requestor and the PPDM cloud platform and PPDM consist of following parties Ticket Checker, Validator and Revocator. The Proposed Secure Cloud Privacy Preserving Data Mining (SCPPDM) architecture is shown in Figure 1. The Requestor, obtain access ticket from different TP and collect information of access ticket pertaining to a specific TC. The Requestor is considered to be in local system or mobile platform or on trusted cloud platform.
The TP provides access ticket to Requestor, which assures the correctness of access ticket information for a particular Requestor to the person the ticket, is given. The TP must first authenticate the Requestor before giving the access ticket. The authentication is done in presence of user physically or remotely through online by collecting information such username and password etc.
The TC provides a secure access control mechanism to access services or resources. It restrict the access tickets that Requestor must possess and also restrict the amount of The Revocator is accountable for access ticket revocation. It prevents the access ticket in generating of specifier ticket. The access ticket cannot be used by any TC for any purposes, if the TP execute access ticket revocation from Revocator. Similarly the revocation is considered to be with in TC and does not affect the specifiers of other TC, if revocations are executed by the TC. The party such as the Requestor and the TC must first obtain the most recently updated revocated data from the Revocator to compute, respectively validate, ticket specifier.
The Validator is a trusted party which preserves privacy of specifier tickets under certain conditions. To assure the TC first agree in the specifier rule in which Validator should be accessible to attributes under certain conditions. Therefore the Requestor has knowledge of privacy preserving when the ticket is computed and can take part to achieve it. This helps the Requestor to make desired resolution based on their requirement and trust of Validator.

A. Keyed Verification Access Ticket
is a system specific parameter ( ) that is available to models and guarantee all parties it is generated correctly.
is computed by TP on given input to compute TP parameter and master key . ( ( )) is an possible collaborating mechanism where Requestor can obtain access ticket on attributes ( ) form TP which gives Subset of its attributes.
( ( ) ) is an collaborating mechanism among Requestor and a TC. View is invoked by Requestor to assure evidence of ownership of access ticket attesting some set of attributes ( ) that fulfil a set of conditions under key resultant to . The is invoked by the TC in ownership of to check that the Requestor has a access ticket for attributes fulfilling the condition .

Keyed Verification Access ticket from SCPPDM
The proposed access ticket model from . We now compute the following process mechanism for access ticket model. Here demonstrates that ̅ is a usable SCPPDM with respect to access ticket model and the TP param. The Requestor accepts the output ̅ if its proof is verified or else rejects with output . An alternative way to hide some subset of { } attributes from TP, the following procedure is followed The user generates the Elliptical curve cryptography (ECC) key pair , then for each attribute generates an encryption as , using . Then the Requestor transmits the proof of knowledge { } and ciphertext to the TP. The TP selects and then evaluates process encryption ̅ of ̅ using homomorphic properties and the encryption is randomized by multiplying the encryption of using randomness ̅ to get ̅ ̅ . Then it transmit ̅ ̅ to the Requestor and provides a proof (the randomizing factor ̅ and proof of knowledge of correctness { } ̿ ) that param computed is correct with respect to ( { } ). The Requestor decrypts ̅ ̅ to obtain ̅ , and output ̅ else output with if it fails to verify the proof.

Access Ticket Translation
In spite of proving that the cipher data are well formed, the requestor can add proof of attributes that the cipher data encodes. As a result the Requestor can prove that some of attributes are similar in another access tickets which aid in reducing computation overhead resulting in increase in cost.
Access Ticket Specifier: Here the computation of View and ViewCheck is presented. Here we present how the access ticket attributes is certified for the given set of assurance. The Requestor can prove large type of condition about assured param once assurance has been established. If any attributes is exposed to the TC, the Requestor sends in position of and the TC directly evaluate . To reduce the computation cost of Requestor the proof of knowledge of is not considered here in our model. The TC can re-compute the SCPPDM and compare it with assured param by using homomorphic properties of the assurance. Though there is an added criterion in SCPPDM the TC evaluates, due to randomness in assurance. To evaluate precise param required to omit these criteria and to prove the generated param is correct, the Requestor will use ⃗ from .

Results and Analysis
To evaluate the performance of proposed SCPPDM model the experiment is conducted on following environment. The model is implemented on windows 10 enterprises operating system, I-5 class quad core processor with 8GB of RAM. The SCPPDM model is developed using C# and Microsoft Dot net Framework 4.5 and above, The SCPPDM model is deployed on D2 instance Microsoft Azure public cloud storage platform. The D2 instance charges its user 16.82 rupees (₹)/hour. The experiment is conducted to evaluate computation overhead by varying ticket and attributes size.  In Figure 2, the computation time of proposed model to generate ticket and issues ticket to Requestor for varied tickets is shown. The ticket size is varied form 10, 50, 100, 250 and 500. The computation time for 10 ticket is 4.01 seconds, for 50 is 14.92 seconds, for 100 is 29.16 seconds, for 250 is 70.92 seconds and for 500 is 133.78 seconds. The experimental outcomes show when the ticket size is increased the computation time is also increase. The outcome show the proposed model is scalable irrespective of application since it can generate and issue large number of ticket.
In Figure 3, the computation cost of proposed model to generate ticket and issues ticket to Requestor for varied tickets is shown. The ticket size is varied form 20, 40, 60, 80 and 100. The computation cost accessing resource on Microsoft azure D2 instance charges 16.82 rupees (₹) per hour usage. The computation cost for 10 ticket is 0.026 ₹, for 40 is 0.0509₹, for 60 is 0.0738 ₹, for 80 is 0.098 ₹ and for 100 is 0.13 ₹. The experimental outcomes show when the ticket size is increased the computation cost is also increase. The outcome show the proposed model is cost effective considering varied ticket size. In Figure 4, the computation time of proposed model for varied attributes is shown. The attributes size is varied form 10, 20, 30 and 40 and ticket size is kept constant with 100 tickets. The computation time for 10 ticket is 29.06 seconds, for 20 is 29.16 seconds, for 30 is 29.78 seconds and for 40 is 30.14 seconds. The experimental outcomes show the proposed model computation time is not affected by attributes size since the computation time is in range of 29.06 seconds to 30.14 seconds.
In Figure 5, the computation time and computation cost of proposed model for varied attributes is shown. The attributes size is varied form 25, 50, 75 and 100 and ticket size is kept constant with 500 tickets. The computation time and cost for 25 tickets is 133.02 seconds and 0.62₹, for 50 is 133.336 seconds and 0.622₹, for 75 is 133.67 seconds and 0.624₹ and for 100 is 134.78 seconds and 0.629₹. The experimental outcomes show the proposed model cost of computation depends on amount of time it takes to compute on azure cloud.

Conclusion
Providing security and preserving privacy of user access to data stored on public cloud platform is most desire. The existing models presented by C. Yanli et al. and K. Yang et al. reduces the computation time but the cost of computation for user on cloud platform increases and the authentication model presented by them is specific to particular application as result it is not applicable for secure authentication for variety of applications service that exists currently. To address this here we presented a secure and efficient privacy preserving of mining data on public cloud platform by adopting party and key based authentication strategy. The experiment is conducted to evaluate computation time and computation cost by varying number of ticket and attributes. The outcomes show the proposed model reduces computation time and cost consider azure public cloud platform. The proposed model is not affected by the attributes size which shows the robustness, scalabity and computation efficiency of our SCPPDM model.