Peer Prediction-Based Trustworthiness Evaluation and Trustworthy Service Rating in Social Networks

With the development of online applications based on social networks, many different approaches have emerged to evaluate the service that these applications provide. Reports made by end users regarding the consumer’s experience or opinion are commonly used to rate the quality of different online services. Therefore, ensuring the authenticity of the users’ reports, and the detection of malicious users’ dishonest reports, have both become important issues to achieve accuracy in the rating of such services. In this paper, we propose and evaluate a private-prior peer prediction-based trustworthy service rating system, which requires users to report their prior and posterior beliefs regarding whether their peers will report a high-quality opinion of the service. The reports are made to a data processing center which evaluates the users’ trustworthiness by applying a strictly proper scoring rule, and removes reports received from users whose trustworthiness rating is low. This peer prediction method is compatible with incentives to motivate users to report honestly. In addition, an unreliability index is proposed to identify malicious users, and malfunctioning or unreliable users who have a high error rate in making judgments about quality. Thus, reports with high unreliability values will also be excluded from the service rating system. By combining trustworthiness and unreliability, malicious users face the dilemma that they cannot receive both a high trustworthiness and low unreliability rating simultaneously when their reports are false. Simulation results indicate that the proposed peer prediction-based trustworthy service rating can identify malicious and unreliable behaviors effectively and motivate users to report truthfully, and that a relatively high service rating accuracy is achieved by the proposed system.


I. INTRODUCTION
Information communication and computation technologies have been developing rapidly in recent years. With the growing demands of big data and the development of different applications, the emerging fifth generation (5G) mobile communication technology will be a multi-service and multi-technology integrated network, which can enhance the user experience by providing various intelligent and customized services [1]. Moreover, social networks have become important platforms for users to enjoy different kinds of online services. With the rapid development of Internet-based applications, different approaches to realizing these applications have emerged. Take e-commerce, for instance, in which users are allowed to use different online or mobile payment systems, such as PayPal, Google Wallet, Alipay and Apple Pay, to complete payments. In addition, for some file sharing applications, users can use different downloaders to download their favorite music, movies or other media files.
In order to provide accurate and useful suggestions to new users and help them to make choices, the use of service quality ratings for these different services has become an important method [2], [3], [4]. Concerning this problem, the feedback and evaluation from users who have experienced a service provide essential reference information for the service rating [5], [6], [7]. Meanwhile, social networks provide platforms that collect and share users' feedback, according to which the service rating can be provided through some data fusion mechanism. However, false and dishonest reports from malicious users can destroy the fairness and usefulness of such ratings. Therefore, it is rather necessary to introduce some trust assessment function to such systems and design an incentive mechanism to motivate users to output truthful feedback.
In this paper, we will establish a peer prediction based trustworthy service rating system for social networks. With peer prediction based decision, network functions of malicious behavior detection, trustworthiness and unreliability assessment can be achieved. Then the reliable and trustworthy service ratings can be obtained by the feedback from honest and reliable users. In this work, we assume that the service quality is an objective evaluation independent of users' subjective judgements. This assumption is reasonable for many service quality indicators, such as convenience of online payment methods and download speeds [8], [9].

A. Literature Review
Service ratings for different application systems have been active research topics over the past decades. Many service evaluation systems have been developed for mobile social networks [10], multiple-provider service systems [11] and many other kinds of web services [12], [13]. In [14], researchers designed objective rating scores of products or services through an iterative rating algorithm. This rating mechanism entirely decoupled the credibility assessment of the evaluations from the ranking itself, which makes it very robust against collusion attacks as well as random and biased raters. A two-phase methodology was proposed in [15] for systematically evaluating the performance and availability of cluster-based Internet services. A service rating scheme that is robust against manipulations by malicious users and services was proposed in [16]. In [16], the service rating made by the target customer was predicted, based on which the system helped this customer to choose a suitable service. The authors of [17] proposed a user-service rating prediction approach for the recommender system by exploring social users' rating behaviors. In [17], the user's social relationships were considered in order to understand how social users' rating behaviors diffuse.
A social network is a platform that allows its users to obtain services and share their experiences [18]. Based on the feedback gathered, a data processing center (DPC) can provide quality ratings for different services, which can in turn give suggestions to new users. To ensure the accuracy of service ratings, the trustworthiness and reliability of the feedback from users need to be checked and ensured. Currently, trust and reputation management has become a challenge in many kinds of feedback and decision systems. Many trustworthiness evaluation mechanisms have been proposed for social networks [19], [20], wireless sensor networks [21], [22] and cloud-based service systems [23]. To motivate secondary users (SUs) in a multiple channel cognitive radio network to report truthfully, a Stackelberg game model was designed in [24], according to which trustworthy SUs gain transmission opportunities as rewards. A consumer feedback based service rating system was presented in [25] to evaluate the trustworthiness of a cloud service. In [25], a novel protocol was proposed to improve and ensure the credibility of trust feedback from consumers. In [26], a dynamic trust evaluation model was proposed to evaluate the user's reputation. The authors of [26] considered both users' preferences for different quality of service attributes and the impact of vicious ratings on trust evaluation. For rating the reputation of the service, different users' ratings were weighted dynamically according to their honesty assessment, and the influence of malicious ratings was thus effectively diluted.
Most of the local and global trustworthiness evaluation methods mentioned above are built on users' own current and/or past behaviors. Further, some researchers have considered the relationships and interactions among users of a network for user trustworthiness assessment and prediction [27], [28], [29], [30], [31], although incentive mechanisms for truthful information have not been studied much. Originally applied in electronic commerce, common-prior peer prediction with a strictly proper scoring rule [32], [33] was proposed for eliciting truthful feedback from users in [34]. To be specific, peer prediction refers to a scheme that uses one user's report to update or predict a probability distribution for the report of someone else, whom we refer to as the "peer". The former user is then scored on a comparison between the likelihood assigned to the peer's possible ratings and the peer's actual rating. Moreover, in the common-prior peer prediction mechanism, the prior probability of the product type or service quality is commonly held, and, conditional on it, the probability distribution of a user's received product type or service quality is also common knowledge. Relaxing the common-prior assumption, the authors of [35] modified the classical peer prediction method such that only users' subjective and private opinions were needed; this trustworthiness evaluation mechanism is known as private-prior peer prediction. Both of these peer prediction methods estimate trustworthiness using strictly proper scoring rules, which can provide incentives for truthful reporting. The peer prediction mechanism can be applied efficiently in scenarios where the prior knowledge is subjective and private to each user.
For instance, peer prediction has been used in wireless sensor networks [36], [37], cognitive radio networks [38], [39], social and online systems [40], [41] and many other kinds of crowd-sourcing systems [42], [43] to collect truthful reports from users, and has been considered as an effective solution to elicit trustworthy feedback. In this paper, we propose a service rating system for social network based services according to honest users with high trustworthiness. Private-prior peer prediction is introduced to evaluate users' trustworthiness and motivate users to provide truthful feedback.

B. Contributions and Organization
The main contributions of this paper can be summarized as follows.
• We introduce private-prior peer prediction in the service rating system of social networks. The user trustworthiness obtained through certain strictly proper scoring rules is formulated to motivate users to report truthfully. We analyze the incentive compatibility of the basic peer prediction mechanism with respect to the false alarm and missed detection probabilities of judgement and report.
• We propose an unreliability index to eliminate unreliable reports from the service rating system. By applying the unreliability index, malicious users are confronted with a dilemma: they cannot obtain a high trustworthiness and a low unreliability at the same time when they provide a false report. However, the best choice for honest users is still to report truthfully, even for poorly functioning ones with high judgement error rates.
• Based on the proposed user trustworthiness and unreliability index, we design a service rating framework. In this framework, trustworthiness is used to evaluate whether the subject user's report is dishonest and the user is malicious. The unreliability index, on the other hand, is introduced to determine whether the reports are reliable, without considering the type of the users, i.e., honest or malicious. By removing the feedback reports with high unreliability, and the reports received from users with low trustworthiness, from the final rating procedure, an accurate and trustworthy service rating can be achieved.
The rest of this paper is organized as follows. In Section II, the system model is described. The private-prior peer prediction based user trustworthiness evaluation for motivating truthful reports is proposed in Section III. Then we analyze the reliability of users' reports and design the service rating system in Section IV. Simulations are presented in Section V, and conclusions are drawn in Section VI.

II. MATHEMATICAL MODEL FOR SERVICE RATING BASED ON USER REPORT FUSION
With the boom of online applications based on social networks, different services to support these applications have emerged [44], [45]. As mentioned previously, users are allowed to select different online payment methods to complete their online purchases, or to download their favorite music and movies with the downloaders that they think are faster and more reliable. To rate the quality of different services, users are required to report and share their consumption experiences or opinions on the social application platform, which can use this valuable feedback for service rating and for helping new users to judge whether the applications can provide high-quality services. In our work, the quality of the services is considered an objective evaluation independent of users' subjective judgements. For instance, different users tend to have the same opinion about whether a payment system is convenient or a downloader has a high download speed. Such a social rating system is different from systems such as movie reviews, in which users' subjective opinions and standards may vary considerably between individuals.
In this case, users' truthful feedback on a service is important for achieving an accurate rating of the quality of this service approach and for providing helpful suggestions to new users. However, some malicious users in social networks provide untruthful evaluations of the service quality for their own purposes. On the one hand, to improve the competitiveness of a target service approach, malicious users may report to the service rating system that it is high in quality when it actually performs badly. On the other hand, malicious users may report a low service-quality evaluation to lower the rating of the target service approach, which discourages new users from selecting it. These malicious behaviours undermine the fairness of the service rating and provide unreliable suggestions to new users. Therefore, it is important to make sure that the feedback from users is truthful.
In this work, we design a mechanism to provide incentives for truthful opinions of users. Moreover, we define a trustworthiness management method to identify malicious users; by excluding their untruthful feedback, a service rating with high accuracy can be obtained.

A. System Model
Consider a population of $N$ users distributed over a social network with a service platform, which can provide different approaches to a service. The quality $Q$ of the service is a binary random variable taking values in $\{l, h\}$, referring to low quality and high quality, respectively. As mentioned previously, this quality is an objective fact. In other words, after experiencing the service, different honest users tend to give the same evaluation or opinion, independent of their individual subjective standards. As shown in Fig. 1, each user $i$ ($i = 1, 2, \cdots, N$) accepts service $m$ and then forms a binary opinion of the service quality, denoted by $S_i = s_i \in \{l, h\}$. Meanwhile, users are allowed to provide some required QoS reports to the cloud, and these reports are processed by the DPC. For instance, the opinion report $x_i \in \{0, 1\}$ is generated by applying a report strategy $r_i : S_i \to \{0, 1\}$. If user $i$ is honest, he/she reports $x_i = 1$ to the cloud when $S_i = h$ and $x_i = 0$ when $S_i = l$. We assume that $x_i$ is semi-public information published to the social service-evaluation platform by the cloud, which can be observed by the DPC and by the other users in the social network who are friends of user $i$. In addition, $S_i$ is private, local information known only to user $i$; neither the other users nor the cloud can obtain it.

B. Service Rating Based on User Report Fusion
Define the false alarm of the judgement as the event that the service is misjudged as low quality while it is in fact high quality. User $i$'s false alarm probability of judgement is denoted by $P_{fa,i} = P(S_i = l \mid Q = h)$; to simplify the expressions, we let $P_{fa}$ denote the false alarm probability of judgement. Conversely, define the missed detection probability of judgement as $P_{md,i} = P(S_i = h \mid Q = l)$, and similarly use $P_{md}$ as its simplified expression. As mentioned previously, the quality is an objective fact, so both honest and malicious users tend to make similar and accurate judgements of it. We therefore assume that $P_{fa} < 0.5$ and $P_{md} < 0.5$ hold for all users in the social network.
On the other hand, we define the false alarm of the report as the event that a user reports a low-quality evaluation to the cloud when the quality of service is high, and the missed detection of the report as the event that a user reports a high-quality evaluation when the quality of service is low. User $i$'s false alarm and missed detection probabilities of report are

\[ P_{f,i} = P(x_i = 0 \mid Q = h), \qquad P_{m,i} = P(x_i = 1 \mid Q = l), \]

and their simplified expressions are $P_f$ and $P_m$. The user type is represented by $\theta_i \in \{0, 1\}$, i.e., $\theta_i = 0$ if user $i$ is honest and $\theta_i = 1$ if user $i$ is malicious. Assume that if user $i$ is malicious, his/her false alarm cheating rate is $P^c_{f,i} \in [0, 1]$ and his/her missed detection cheating rate is $P^c_{m,i} \in [0, 1]$. We assume that honest users always report their real judgement of the service quality, whether or not that judgement is accurate. Then for each user $i$, $P_{f,i}$ and $P_{m,i}$ are given by (1) and (2), respectively; for an honest user ($\theta_i = 0$) they reduce to $P_{f,i} = P_{fa,i}$ and $P_{m,i} = P_{md,i}$.

As shown in Fig. 1, based on the users' reports received by the cloud, the DPC can obtain the trustworthiness $T_i$ of each user and rate the service by applying the following rule:

\[ \hat{Q} = \begin{cases} h, & \sum_{i \in \mathcal{T}} x_i > n, \\ l, & \text{otherwise}, \end{cases} \tag{3} \]

where $\hat{Q}$ is the rating made by the DPC and $n$ is the threshold of service rating. In (3), $\mathcal{T} = \{ i \mid T_i \geq t \}$ is the set of honest users with high trustworthiness $T_i$, which is determined by the trustworthiness threshold $t$. A simple decision rule is that the DPC rates the service as high, i.e., $\hat{Q} = h$, only if more than half of the trusted users report that the service's quality is high, i.e., $n = |\mathcal{T}|/2$.
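The fusion rule (3) can be illustrated with a minimal Python sketch. The function name `fuse_reports` and the dictionary-based inputs are illustrative assumptions rather than part of the proposed system, and the strict inequality follows the "more than half" wording above.

```python
from typing import Dict


def fuse_reports(reports: Dict[int, int], trust: Dict[int, float], t: float) -> str:
    """Rate the service as 'h' or 'l' from the opinion reports of trusted users.

    reports: user id -> opinion report x_i in {0, 1}
    trust:   user id -> trustworthiness T_i
    t:       trustworthiness threshold
    """
    # Trusted set: users whose trustworthiness reaches the threshold t.
    trusted = [i for i in reports if trust[i] >= t]
    # Simple-majority threshold n = |T| / 2.
    n = len(trusted) / 2
    votes_high = sum(reports[i] for i in trusted)
    # Assumption: an empty trusted set or an exact tie is rated 'l'.
    return "h" if votes_high > n else "l"


reports = {1: 1, 2: 1, 3: 0, 4: 0}
trust = {1: 0.9, 2: 0.8, 3: 0.7, 4: 0.1}
print(fuse_reports(reports, trust, t=0.5))  # user 4 is excluded from the vote
```

With the threshold lowered to `t=0.0`, user 4 is admitted and the vote becomes a tie, which this sketch rates as low quality.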

III. PEER PREDICTION FOR USER TRUSTWORTHINESS
In this section, we introduce the private-prior peer prediction method, which encourages users to provide rating reports truthfully. By using strictly proper scoring rules to estimate the users' trustworthiness, the mechanism can identify malicious users as those with low trustworthiness. Users are then motivated to report truthfully in order to obtain high trustworthiness and avoid being classified as malicious.

A. Private-Prior Peer Prediction Mechanism
Private-prior peer prediction is an incentive compatible mechanism originally proposed to motivate agents to report their private prior and posterior signal beliefs in electronic commerce [35]. In the basic private-prior peer prediction mechanism, each agent $i$, coupled with his/her peer agent $j = i + 1$, is required to report his/her prior and posterior signal beliefs about the state, before and after observing the signal, respectively. From these two reports, agent $i$'s score can be calculated by a strictly proper scoring rule, which will be introduced later in this section.
1) Prior belief reports to the cloud: In the system established in this work, any two users accepting the same service can be considered a pair of peers, which establishes a kind of friendship and a topology among all users. For rating the quality of service $m$, we consider that each user $i$ has one peer user $j$ selected randomly from the other users who have accepted and will continue to accept the same service $m$ as $i$. Before experiencing the service, user $i$ is required to report to the cloud his/her prior belief $y_{ij} \in [0, 1]$, also called the information report, that his/her peer user $j$ will report a high-quality signal, i.e., $x_j = 1$. Then $y_{ij}$ can be given by

\[ y_{ij} = P_i(x_j = 1) = P_i(x_j = 1 \mid Q = h)\, P_i(Q = h) + P_i(x_j = 1 \mid Q = l)\, P_i(Q = l). \tag{4} \]

In (4), the conditional probabilities $P_i(x_j = 1 \mid Q = h) = 1 - P_{f,j}$ and $P_i(x_j = 1 \mid Q = l) = P_{m,j}$ can be obtained from the previous reports $x_j$ of user $j$ released in the network. $P_i(x_j = 1 \mid Q = h)$ represents the probability that user $j$ gives a "high quality" report for the service when user $i$ makes a high-quality judgement of the same service, i.e., $S_i = h$. This judgement is private, local information known only to user $i$, as is user $i$'s prior belief about the service quality. The second equality in (4) therefore follows from the law of total probability over $Q$.
2) Posterior belief reports to the cloud: After experiencing the service, user $i$ forms his/her own opinion of the service quality $S_i = s_i$, and then sends to the cloud the posterior belief, also called the prediction report and denoted by $y'_{ij}(s_i) \in [0, 1]$, that the peer user $j$ will report a high-quality evaluation of the service. Then $y'_{ij}$ can be expressed as

\[ y'_{ij}(s_i) = P_i(x_j = 1 \mid S_i = s_i). \tag{5} \]

Similar to the previous analysis, $y'_{ij}(s_i)$ can be decomposed into two cases as follows:

\[ y'_{ij}(h) = (1 - P_{f,j})\, P_i(Q = h \mid S_i = h) + P_{m,j}\, P_i(Q = l \mid S_i = h), \tag{6a} \]
\[ y'_{ij}(l) = (1 - P_{f,j})\, P_i(Q = h \mid S_i = l) + P_{m,j}\, P_i(Q = l \mid S_i = l), \tag{6b} \]

where, by Bayes' rule,

\[ P_i(Q = h \mid S_i = h) = \frac{(1 - P_{fa,i})\, P(Q = h)}{(1 - P_{fa,i})\, P(Q = h) + P_{md,i}\, P(Q = l)}, \qquad P_i(Q = h \mid S_i = l) = \frac{P_{fa,i}\, P(Q = h)}{P_{fa,i}\, P(Q = h) + (1 - P_{md,i})\, P(Q = l)}, \tag{7} \]

and $P_i(Q = l \mid S_i = s_i) = 1 - P_i(Q = h \mid S_i = s_i)$.

As defined previously, $y_{ij}$ is user $i$'s prior belief that $x_j = 1$ before user $i$ experiences the service. After user $i$ experiences the service and senses that $s_i = h$, it is reasonable for user $i$ to believe that $x_j = 1$ with a larger probability, i.e., $y'_{ij}(h) > y_{ij}$, which means that $i$'s prior belief that $x_j = 1$ is "boosted". Conversely, $y_{ij} > y'_{ij}(l)$ if user $i$ receives a low-quality service. However, when there are malicious users providing untruthful evaluations of the service, these inequalities are not always satisfied. Lemma 1 provides sufficient conditions that ensure they hold.

Lemma 1. In the private-prior peer prediction mechanism, for each user $i$ with prior and posterior belief reports $y_{ij}$ and $y'_{ij}$ about user $j$, it holds that $y'_{ij}(h) > y_{ij} > y'_{ij}(l)$ if all users satisfy $P_{fa} + P_{md} < 1$ and $P_f + P_m < 1$.
Proof: See Appendix.

Remarks: Since it has been assumed that $P_{fa} < 0.5$ and $P_{md} < 0.5$, the condition $P_{fa} + P_{md} < 1$ always holds for all users. According to (1) and (2), for an honest user $i$, i.e., $\theta_i = 0$, we have $P_{f,i} + P_{m,i} < 1$. On the other hand, for a dishonest user $i$ ($\theta_i = 1$), whether $P_{f,i} + P_{m,i} < 1$ holds depends on his/her false alarm cheating rate $P^c_{f,i}$ and missed detection cheating rate $P^c_{m,i}$. Notice that outright malicious users with relatively high $P^c_f > 0.5$ and/or $P^c_m > 0.5$ will have high $P_f > 0.5$ and/or $P_m > 0.5$, respectively. Users exhibiting either or both of these cheating behaviours can be identified easily from the high report error probability of their former reports. If the rating system removes the reports of users with high former $P_f$ and/or $P_m$, these malicious reports will have no effect when the system updates the rating of the service. Consequently, to sustain their deception, malicious users need to manage their $P^c_f$ and $P^c_m$ so as to disguise themselves as trustworthy users from time to time and keep $P_f < 0.5$ and $P_m < 0.5$. So in our work, we analyze the peer prediction mechanism under the conditions $P_f < 0.5$ and $P_m < 0.5$. Therefore, the condition $P_f + P_m < 1$ in Lemma 1 is reasonable, and in this case the inequality $y'_{ij}(h) > y_{ij} > y'_{ij}(l)$ is always satisfied.
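The ordering stated in Lemma 1 can be checked numerically with a short Python sketch. It assumes that user $i$'s judgement and user $j$'s report are conditionally independent given $Q$, and that $P(x_j = 1 \mid Q = h) = 1 - P_{f,j}$ and $P(x_j = 1 \mid Q = l) = P_{m,j}$; the function names and the numerical values are illustrative only.

```python
def prior_belief(q_h: float, p_f_j: float, p_m_j: float) -> float:
    """Prior belief y_ij that peer j reports x_j = 1 (total probability over Q)."""
    return (1 - p_f_j) * q_h + p_m_j * (1 - q_h)


def posterior_belief(s_i: str, q_h: float, p_fa_i: float, p_md_i: float,
                     p_f_j: float, p_m_j: float) -> float:
    """Posterior belief y'_ij(s_i) after user i judges the service as s_i."""
    if s_i == "h":
        like_h, like_l = 1 - p_fa_i, p_md_i      # P(S_i = h | Q = h), P(S_i = h | Q = l)
    else:
        like_h, like_l = p_fa_i, 1 - p_md_i      # P(S_i = l | Q = h), P(S_i = l | Q = l)
    # Bayes update of the quality given the judgement.
    post_h = like_h * q_h / (like_h * q_h + like_l * (1 - q_h))
    # Predict the peer's report from the updated quality belief.
    return (1 - p_f_j) * post_h + p_m_j * (1 - post_h)


q_h, p_fa, p_md, p_f_j, p_m_j = 0.6, 0.1, 0.2, 0.15, 0.25
y = prior_belief(q_h, p_f_j, p_m_j)
y_h = posterior_belief("h", q_h, p_fa, p_md, p_f_j, p_m_j)
y_l = posterior_belief("l", q_h, p_fa, p_md, p_f_j, p_m_j)
print(y_h > y > y_l)  # ordering of Lemma 1 under P_f + P_m < 1
```

If the peer's report error rates are instead chosen with $P_{f,j} + P_{m,j} > 1$ (for example 0.7 and 0.6), the same functions return the reversed ordering, which is why the condition is needed.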
3) Inferred opinion reports: Instead of reporting the private evaluation of the service quality $S_i$ or $x_i$, user $i$ sends his/her prior and posterior beliefs that peer user $j$ gives report $x_j = 1$. Note that neither report $x_i$ nor $x_j$ is provided directly by the corresponding user. In basic private-prior peer prediction, user $i$ only sends reports $y_{ij}$ and $y'_{ij}(s_i)$ to the cloud, from which the DPC infers the opinion report $x_i$ and publishes it to the social service-evaluation platform. The inferred opinion report $x_i$ is generated by the following rule:

\[ x_i = \begin{cases} 1, & y'_{ij} > y_{ij}, \\ 0, & y'_{ij} < y_{ij}. \end{cases} \tag{8} \]

Remarks: According to Lemma 1, it holds that $y'_{ij}(h) > y_{ij} > y'_{ij}(l)$ when both users $i$ and $j$ satisfy $P_{fa} + P_{md} < 1$ and $P_f + P_m < 1$. In other words, when user $i$ makes a high-quality judgement of the service after experiencing it ($S_i = h$), the inequality $y'_{ij}(h) > y_{ij}$ always holds. Applying (8), the DPC then infers the opinion report as $x_i = 1$ because $y'_{ij} > y_{ij}$, so this inferred report $x_i = 1$ is consistent with user $i$'s real judgement $S_i = h$. Symmetrically, when $S_i = l$, (8) yields the truthful opinion report $x_i = 0$. Therefore, the rule formulated by (8) truthfully reflects the judgement of an honest user under the conditions $P_{fa} + P_{md} < 1$ and $P_f + P_m < 1$.
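As a small illustration of the inference rule (8), the following Python sketch maps a pair of belief reports to an inferred opinion report. The function name is hypothetical, and returning `None` when the two reports are equal is an assumption, since the rule as described covers only strict inequalities.

```python
from typing import Optional


def infer_opinion(y_prior: float, y_post: float) -> Optional[int]:
    """Infer the opinion report x_i from the prior and posterior belief reports."""
    if y_post > y_prior:
        return 1      # belief in a high report was boosted: judgement was 'h'
    if y_post < y_prior:
        return 0      # belief in a high report dropped: judgement was 'l'
    return None       # assumption: equal reports carry no inferable opinion


print(infer_opinion(0.61, 0.77), infer_opinion(0.61, 0.34))
```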

4) User trustworthiness: Based on reports $y_{ij}$ and $y'_{ij}(s_i)$, the DPC calculates user $i$'s trustworthiness through a scoring rule. Users with low trustworthiness are classified as malicious, and their reports are not considered in the service rating system. We first introduce the strictly proper scoring rule, which can motivate users to provide truthful reports $y_{ij}$ and $y'_{ij}(s_i)$.

Definition 1 (Strictly Proper Scoring Rule) [35]: A binary scoring rule is proper if it leads to an agent maximizing his/her expected score by truthfully providing his/her report $y \in [0, 1]$, and is strictly proper if an agent maximizes his/her expected score if and only if he/she provides the report truthfully.
The binary logarithmic and quadratic scoring rules shown in (9) and (10), respectively, are strictly proper, as proved in [32].
1) The binary logarithmic scoring rule:

\[ R(y, \omega = 1) = \ln y, \tag{9a} \]
\[ R(y, \omega = 0) = \ln(1 - y). \tag{9b} \]

2) The binary quadratic scoring rule:

\[ R(y, \omega = 1) = 2y - y^2, \tag{10a} \]
\[ R(y, \omega = 0) = 1 - y^2. \tag{10b} \]

In (9) and (10), $\omega \in \{0, 1\}$ indicates a binary report. We define the trustworthiness of user $i$ as a function of $y_{ij}$, $y'_{ij}$ and $x_j$:

\[ T_i = \alpha R(y_{ij}, x_j) + (1 - \alpha) R(y'_{ij}, x_j) + \beta, \tag{11} \]

where $R(y, \omega)$ is a strictly proper scoring rule and $\alpha \in [0, 1]$ is the parameter weighting the importance of the prior and posterior beliefs. In addition, the trustworthiness accumulates as the service and scoring process continues. A negative trustworthiness can be reflected either as a monetary punishment or as a restriction on submitting reports for the corresponding user, and the negative benefits are transferred as positive benefits to the users as rewards for their honest and accurate reports. Therefore, to keep the budget balanced, $\beta$ is given by (12). In (11), $y_{ij}$ and $y'_{ij}$ are the reports from user $i$ before and after he/she makes the judgement $S_i = s_i$ of the target service approach, respectively, and $x_j$ is user $j$'s implicit opinion report inferred by the DPC from user $j$'s reports.
In addition, according to the analysis above, the trustworthiness of user $i$ is determined only by user $j$'s inferred opinion report $x_j$, user $i$'s prior belief report $y_{ij}$ and posterior belief report $y'_{ij}$. In other words, one user's trustworthiness does not depend on the reports or inferred reports of users other than his/her peer. Therefore, cooperative cheating by malicious users has little effect on the evaluation of users' trustworthiness defined by (11).
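A per-round trustworthiness score of this kind can be sketched in Python as follows. The sketch assumes the standard forms of the binary logarithmic and quadratic scoring rules and an $\alpha$-weighted sum of the prior and posterior scores; the budget-balancing term is passed in as a plain constant `beta` because its expression is not reproduced here. All function names are illustrative.

```python
import math


def log_score(y: float, omega: int) -> float:
    """Binary logarithmic scoring rule (assumed standard form)."""
    return math.log(y) if omega == 1 else math.log(1 - y)


def quad_score(y: float, omega: int) -> float:
    """Binary quadratic scoring rule (assumed standard form)."""
    return 2 * y - y * y if omega == 1 else 1 - y * y


def trustworthiness(y_prior: float, y_post: float, x_j: int,
                    alpha: float = 0.5, rule=quad_score, beta: float = 0.0) -> float:
    """Score user i's two belief reports against peer j's inferred report x_j."""
    return alpha * rule(y_prior, x_j) + (1 - alpha) * rule(y_post, x_j) + beta


# A user who raised the belief from 0.61 to 0.77 is rewarded when the peer reports 1
# and penalised, relative to that, when the peer reports 0.
print(trustworthiness(0.61, 0.77, x_j=1), trustworthiness(0.61, 0.77, x_j=0))
```

Under the logarithmic rule the scores are always non-positive, which matches the remark that a negative trustworthiness can occur.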

B. Incentive Compatibility
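As proved in [35], the prior belief report $y_{ij}$ and the posterior belief report $y'_{ij}(s_i)$ given by user $i$ are temporally separated, because they are made before and after the judgement $S_i = s_i$, respectively. Therefore, $y_{ij}$ and $y'_{ij}(s_i)$ are independent, and we have

\[ E[T_i] = \alpha E\left[ R(y_{ij}, x_j) \right] + (1 - \alpha) E\left[ R(y'_{ij}, x_j) \right] + \beta, \]

where both $\alpha R(y_{ij}, x_j)$ and $(1 - \alpha) R(y'_{ij}, x_j)$ are still strictly proper [34].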

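1) Binary logarithmic scoring rule: We first apply the binary logarithmic scoring rule. Let $p_1 = P(x_j = 1)$ and $p_2 = P(x_j = 1 \mid S_i = s_i)$; then we have

\[ E[T_i] = \alpha \left[ p_1 \ln y_{ij} + (1 - p_1) \ln(1 - y_{ij}) \right] + (1 - \alpha) \left[ p_2 \ln y'_{ij} + (1 - p_2) \ln(1 - y'_{ij}) \right] + \beta. \]

Taking the partial derivatives with respect to $y_{ij}$ and $y'_{ij}$ gives

\[ \frac{\partial E[T_i]}{\partial y_{ij}} = \alpha \left[ \frac{p_1}{y_{ij}} - \frac{1 - p_1}{1 - y_{ij}} \right], \qquad \frac{\partial E[T_i]}{\partial y'_{ij}} = (1 - \alpha) \left[ \frac{p_2}{y'_{ij}} - \frac{1 - p_2}{1 - y'_{ij}} \right]. \]

Setting them to zero, we get the optimal values as

\[ \hat{y}_{ij} = p_1, \tag{16a} \]
\[ \hat{y}'_{ij} = p_2. \tag{16b} \]

Then taking the second partial derivatives with respect to $y_{ij}$ and $y'_{ij}$, and letting $y_{ij} = \hat{y}_{ij}$ and $y'_{ij} = \hat{y}'_{ij}$, we have

\[ \frac{\partial^2 E[T_i]}{\partial y_{ij}^2} = -\frac{\alpha}{p_1 (1 - p_1)} < 0, \qquad \frac{\partial^2 E[T_i]}{\partial {y'_{ij}}^2} = -\frac{1 - \alpha}{p_2 (1 - p_2)} < 0. \]

Therefore, the maximum of $E[T_i]$ is achieved when $y_{ij} = p_1$ and $y'_{ij} = p_2$, which means that user $i$ receives the maximum trustworthiness if and only if he/she reports both $y_{ij}$ and $y'_{ij}$ truthfully.
2) Binary quadratic scoring rule: Next, we employ the binary quadratic scoring rule shown in (10). Thus we have

\[ E[T_i] = \alpha \left[ p_1 (2 y_{ij} - y_{ij}^2) + (1 - p_1)(1 - y_{ij}^2) \right] + (1 - \alpha) \left[ p_2 (2 y'_{ij} - {y'_{ij}}^2) + (1 - p_2)(1 - {y'_{ij}}^2) \right] + \beta. \]

Taking the partial derivatives with respect to $y_{ij}$ and $y'_{ij}$ and setting them to zero, we get the same optimal values as (16a) and (16b). Taking the second partial derivatives, the following inequalities are always satisfied:

\[ \frac{\partial^2 E[T_i]}{\partial y_{ij}^2} = -2\alpha < 0, \qquad \frac{\partial^2 E[T_i]}{\partial {y'_{ij}}^2} = -2(1 - \alpha) < 0. \]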
Remarks: Since $\partial^2 E[T_i] / \partial y_{ij}^2 < 0$ and $\partial^2 E[T_i] / \partial {y'_{ij}}^2 < 0$ are always satisfied, whether the binary logarithmic or the quadratic scoring rule is applied, the maximum of $E[T_i]$ is reached when both (16a) and (16b) are satisfied. In other words, user $i$ receives the maximum trustworthiness if and only if he/she provides both $y_{ij}$ and $y'_{ij}$ truthfully, as mentioned previously. Now assume that cooperative cheating exists, which means that malicious users can coordinate with each other to manage their malicious behaviour. According to Definition 1, user $i$ may obtain a lower score by reporting truthfully than by reporting untruthfully when his/her peer user $j$ is malicious. For example, suppose user $i$ experiences a high-quality service, so that his/her honest reports satisfy $y'_{ij} > y_{ij}$. However, because of user $j$'s dishonest implicit opinion $x_j = 0$, user $i$ will obtain a higher score if he/she gives a lower $y'_{ij} < y_{ij}$ instead of reporting truthfully, according to the binary logarithmic or quadratic scoring rule formulated in (9b) and (10b), respectively. To make sure that honest users are predominant even when cooperative cheating happens in the social network, we assume that the number of malicious users is less than half of the total. Based on this assumption, users with accurate information reports and prediction reports will receive higher trustworthiness in the long run; meanwhile, malicious users will be punished by a loss of trustworthiness every time they submit dishonest reports that result in cheating opinion reports.
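The incentive-compatibility argument above can be checked numerically with a minimal Python sketch that searches a grid of candidate reports for the one maximizing the expected score. It assumes the standard forms of the two binary scoring rules; the helper names `expected_score` and `best_report` are illustrative.

```python
import math


def log_score(y: float, omega: int) -> float:
    """Binary logarithmic scoring rule (assumed standard form)."""
    return math.log(y) if omega == 1 else math.log(1 - y)


def quad_score(y: float, omega: int) -> float:
    """Binary quadratic scoring rule (assumed standard form)."""
    return 2 * y - y * y if omega == 1 else 1 - y * y


def expected_score(y: float, p: float, rule) -> float:
    """Expected score of reporting y when the peer reports 1 with probability p."""
    return p * rule(y, 1) + (1 - p) * rule(y, 0)


def best_report(p: float, rule, steps: int = 1000) -> float:
    """Grid search over y in (0, 1) for the report with the highest expected score."""
    candidates = [k / steps for k in range(1, steps)]
    return max(candidates, key=lambda y: expected_score(y, p, rule))


for p in (0.3, 0.72):
    print(p, best_report(p, log_score), best_report(p, quad_score))
```

For both rules the maximizer coincides with the true belief `p`, in line with (16a) and (16b).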

IV. USER TRUSTWORTHINESS AND UNRELIABILITY BASED SERVICE RATING

A. Unreliability of User Report
In private-prior peer prediction, all users are required to report, before experiencing the service, their prior belief $y_{ij} = P_i(x_j = 1)$ that their peer users will report a high evaluation of the service. This report can be obtained from the past reports $x_j$ inferred by the DPC and published by the cloud, which means that the past reports $x_j$ are accessible to $i$'s other friends in the social network, to the cloud and to the DPC. Therefore, it is difficult for malicious users to fabricate the information report $y_{ij}$. To cheat, malicious user $i$ needs to manage his/her information and prediction reports according to (8), i.e., $y'_{ij} = y_{ij} + \varepsilon$ ($\varepsilon > 0$) with probability $P^c_{m,i}$ when the service quality is low ($Q = l$), and $y'_{ij} = y_{ij} - \varepsilon$ with probability $P^c_{f,i}$ when the service quality is high ($Q = h$). Meanwhile, malicious users have to set $\varepsilon$ as small as possible to avoid being punished by a large loss of score and trustworthiness when their peer users are honest. In addition, notice that false-alarm reports and missed-detection reports result not only from the wrong judgements of honest users but also from dishonest users' cheating behaviours, according to (1) and (2). Both situations are considered unreliable behaviours which need to be identified and removed from the final service rating. Therefore, it is necessary to set a threshold that limits the minimum gap between $y_{ij}$ and $y'_{ij}$. Next, we analyze the influence of false-alarm judgements and missed-detection judgements on the scoring. Taking the derivatives of (6a) and (6b) with respect to $P_{fa,i}$ and $P_{md,i}$, we obtain

\[ \begin{aligned} \frac{\partial y'_{ij}(h)}{\partial P_{fa,i}} &= -\frac{(1 - P_{f,j} - P_{m,j})\, P_{md,i}\, P(Q = h) P(Q = l)}{D_h^2}, & \frac{\partial y'_{ij}(h)}{\partial P_{md,i}} &= -\frac{(1 - P_{f,j} - P_{m,j})\, (1 - P_{fa,i})\, P(Q = h) P(Q = l)}{D_h^2}, \\ \frac{\partial y'_{ij}(l)}{\partial P_{fa,i}} &= \frac{(1 - P_{f,j} - P_{m,j})\, (1 - P_{md,i})\, P(Q = h) P(Q = l)}{D_l^2}, & \frac{\partial y'_{ij}(l)}{\partial P_{md,i}} &= \frac{(1 - P_{f,j} - P_{m,j})\, P_{fa,i}\, P(Q = h) P(Q = l)}{D_l^2}, \end{aligned} \tag{21} \]

where $D_h = (1 - P_{fa,i}) P(Q = h) + P_{md,i} P(Q = l)$ and $D_l = P_{fa,i} P(Q = h) + (1 - P_{md,i}) P(Q = l)$. Based on the previous assumption $P_{f,j} + P_{m,j} < 1$, the derivatives of $y'_{ij}(h)$ are negative and those of $y'_{ij}(l)$ are positive. So in both situations $Q = h$ and $Q = l$, the score of user $i$ goes down with increasing $P_{fa,i}$ and $P_{md,i}$ when user $j$ reports truthfully, according to (9a)/(10a) and (9b)/(10b), respectively.
In other words, for fixed $P_f$, $P_m$ and $P(Q = h)$, honest users with high judgement accuracy receive higher scores and trustworthiness than honest users with high judgement error rates, and than malicious users who invert their prediction reports conservatively in order to give wrong reports while minimizing the loss of score. In the service rating system, neither the implicit opinion reports of malicious users nor those of honest users with low judgement accuracy should be considered. To identify these two kinds of unreliable behaviour, we define an unreliability index $\rho_i$, given in (22), which indicates the unreliability of user $i$ from his/her prior belief report $y_{ij}$ and posterior belief report $y'_{ij}$.
Remarks: In (22), the first situation $y'_{ij} < y_{ij}$ indicates that the closer the report $y'_{ij}$ is to $P\{x_j = 1 \mid Q = l\}$ when the service quality is low, and the farther it is from $P\{x_j = 1 \mid Q = h\}$ when the service quality is high, the more reliable $y'_{ij}$ is. Meanwhile, for $y'_{ij} > y_{ij}$, the report can be considered reliable when $y'_{ij}$ is close to $P\{x_j = 1 \mid Q = h\}$ and far from $P\{x_j = 1 \mid Q = l\}$. In addition, according to (21), $y'_{ij}(l)$ increases with growing $P_{fa,i}$ and $P_{md,i}$ and is more sensitive to $P_{fa,i}$ than to $P_{md,i}$; $y'_{ij}(h)$ decreases with growing $P_{fa,i}$ and $P_{md,i}$ and is more sensitive to $P_{md,i}$ than to $P_{fa,i}$. With the assumption $P_{fa,i}, P_{md,i} \in [0, 1]$, we get $P_{m,j} \leq y'_{ij} \leq 1 - P_{f,j}$, and the definition of unreliability shown in (22) can be rewritten as (23). To calculate the unreliability of users' reports, the DPC needs to observe the report error rates $P_f$ and $P_m$ of each user based on the historical reports and service rating results. In addition, we assume that the service quality distribution, denoted by $P(Q = l)$ and $P(Q = h)$, can also be obtained from long-term and relatively stable historical rating results of services. Such assumptions are feasible and reasonable, considering that most current service-based application systems are able to provide such information. By utilizing (23), users with high unreliability $\rho$ are considered uncertain: they might be honest users with a high judgement error rate or malicious users. Reports from these users are not reliable for the DPC when rating the quality of service. Consequently, the DPC needs to set a threshold $\rho_{thr}$, and reports from users whose unreliability exceeds $\rho_{thr}$ are removed from the service rating procedure. The threshold can be designed according to the typical error rates of honest users with relatively high judgement accuracy.
Next, we discuss the validity of the user unreliability defined in (23). Take the case Q = h as an example: to cheat, malicious user i has to give the prediction report y ′ ij = y ij −ε < y ij . To obtain an unreliability value below the threshold, so that the false report counts in the service rating, user i needs to fabricate y ′ ij so that it is close to P m,j and far from 1 − P f,j . With conditions P f,i < 0.5 and P m,i < 0.5, the smaller y ′ ij is, the lower the unreliability value obtained. On the other hand, the majority of honest users tend to give the implicit opinion report x j = 1 when Q = h. According to (9a) and (10a), the score of user i decreases as y ′ ij is reduced when his/her peer user j gives an accurate and honest report. By symmetry, the same dilemma exists when Q = l. Therefore, it is difficult for malicious users to obtain high trustworthiness and low unreliability at the same time if they report deceptively. However, for "badly functioning" honest users with a relatively high judgement error rate, the best choice is still to report y ij and y ′ ij truthfully. They have no reason to modify y ′ ij , because their benefit is the score and trustworthiness determined by the information and prediction reports, and this benefit does not depend on whether their reports are accepted by the DPC.
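The score side of this dilemma can be checked numerically. The following is a minimal sketch, not the paper's implementation: it assumes the standard binary logarithmic and binary quadratic scoring rules (equations (9a) and (10a) are not reproduced in this section, so their exact form is an assumption), and the prior value 0.7 and the ε values are hypothetical. It evaluates the posterior report y ′ ij = y ij − ε against a peer report x j = 1.

```python
import math

def log_score(y, x):
    """Binary logarithmic rule (assumed form): ln(y) if the peer reports 1, ln(1 - y) otherwise."""
    return math.log(y) if x == 1 else math.log(1.0 - y)

def quadratic_score(y, x):
    """Binary quadratic rule (assumed form): 2y - y^2 if the peer reports 1, 1 - y^2 otherwise."""
    return 2.0 * y - y * y if x == 1 else 1.0 - y * y

prior = 0.7  # hypothetical y_ij
for eps in (0.1, 0.2, 0.3):
    posterior = prior - eps  # the malicious report y'_ij = y_ij - eps
    # Peer j is honest and accurate, so x_j = 1 when Q = h.
    print(eps, log_score(posterior, 1), quadratic_score(posterior, 1))
```

Under both assumed rules the printed scores shrink as ε grows, while the text above argues that a smaller y ′ ij lowers the unreliability value, so the two criteria pull in opposite directions.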

B. Peer Prediction Based Service Rating
Based on the trustworthiness and unreliability analysis above, we design the private-prior peer prediction based service rating method as the following procedure.
1) For every user i who accepts the service, choose another non-overlapped user j randomly among his/her friends as i's peer.
2) Ask user i for his/her prior belief report y ij ∈ [0, 1], i.e., his/her belief that peer j will report to the cloud that j evaluates the service as high-quality.
3) User i experiences the service and then makes his/her judgement S i = s i of the quality of the service.
4) Ask user i to send to the cloud his/her posterior belief report y ′ ij ∈ [0, 1], with y ′ ij ̸ = y ij , that his/her peer j will provide a report of receiving a high-quality service.
5) The DPC calculates the unreliability of every user by applying (23), and removes the reports of users with ρ i > ρ thr from the service rating system.
6) The DPC infers the implicit opinion report x i of user i through (8), and calculates user i's trustworthiness according to (9)/(10) and (11), assisted by user j's inferred opinion report x j . It then removes the reports of users with the lowest trustworthiness from the service rating system.
7) The DPC rates the service through (3), using the implicit opinion reports of users with both high trustworthiness and low unreliability.
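Steps 5) to 7) can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the unreliability ρ i is taken as a precomputed input because (23) is not reproduced here; the opinion inference stands in for (8) and assumes that a posterior above the prior signals a high-quality opinion; the trustworthiness is taken as the sum of binary quadratic scores of the prior and posterior reports; and `drop_fraction` is a hypothetical parameter for removing the lowest-trustworthiness users.

```python
from dataclasses import dataclass

@dataclass
class Report:
    user: str
    peer: str             # user j chosen as i's peer
    prior: float          # y_ij
    posterior: float      # y'_ij, required to differ from the prior
    unreliability: float  # rho_i, assumed to be computed beforehand

def implied_opinion(r):
    # Assumed stand-in for (8): posterior above prior means "high quality".
    return 1 if r.posterior > r.prior else 0

def quadratic_score(y, x):
    # Binary quadratic scoring rule (assumed form).
    return 2.0 * y - y * y if x == 1 else 1.0 - y * y

def rate_service(reports, rho_thr, drop_fraction=0.2):
    """Return 1 (high quality) or 0 (low quality); every peer must also appear as a user."""
    opinions = {r.user: implied_opinion(r) for r in reports}
    # Step 5: discard reports whose unreliability exceeds the threshold.
    kept = [r for r in reports if r.unreliability <= rho_thr]
    # Step 6: score both belief reports against the peer's inferred opinion,
    # then sort in ascending order of trustworthiness.
    scored = sorted(
        kept,
        key=lambda r: quadratic_score(r.prior, opinions[r.peer])
        + quadratic_score(r.posterior, opinions[r.peer]),
    )
    trusted = scored[int(len(scored) * drop_fraction):]
    # Step 7: majority vote over the remaining implicit opinions.
    votes = sum(opinions[r.user] for r in trusted)
    return 1 if 2 * votes > len(trusted) else 0

example = [
    Report("a", "b", 0.6, 0.80, 1.0),
    Report("b", "c", 0.6, 0.85, 1.2),
    Report("c", "a", 0.6, 0.80, 0.9),
    Report("d", "a", 0.6, 0.50, 2.0),  # lowers the posterior against the majority
    Report("e", "b", 0.6, 0.70, 9.0),  # exceeds the unreliability threshold
]
print(rate_service(example, rho_thr=5.0, drop_fraction=0.25))
```

In this toy input, user e is filtered by the threshold and user d receives the lowest score because its posterior moves away from its peer's inferred opinion. A tie or an empty trusted set is rated as low quality here, which is a choice of this sketch only.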
V. SIMULATION RESULTS
In this part, we perform numerical simulation experiments to analyze the properties and performance of the private-prior peer prediction service rating system and the factors that influence it, such as the proportion of malicious users and ε. First, we analyze how trustworthiness and unreliability accumulate over time. In the peer prediction mechanism, if the peer of an honest user is a malicious user who decides to cheat when reporting to the cloud, the trustworthiness of the honest user tends to be low because of the strictly proper scoring rule. However, when malicious users are not predominant in the social network, i.e., when they make up less than half of all users, the accumulative trustworthiness of honest users increases distinctly relative to that of malicious users in the long term.

A. Simulation Settings
The simulation of the service rating system is based on the topology of Flickr, a real-world online social network database. The Flickr topology contains 5,899,882 edges connecting 80,513 users, and each edge represents the friendship between the two connected users. This friendship, which defines the topology, is determined by the users' favorites. In other words, a connection between two users is established if they share common favorites and have followed the same community. Such two users are then considered a pair of peers. The topology of the Flickr network is depicted in Fig. 2. The users are separated into three types, i.e., reliable honest users with a high judgement accuracy rate, malicious users with a high judgement accuracy rate and a high report error rate, and unreliable honest users who have a relatively high judgement error rate but always report truthfully. The three types of users exist in certain percentages. We set the false alarm rate of judgement P f a and the missed detection rate of judgement P md to be uniformly distributed random variables: for all reliable and malicious users P f a , P md ∼ U [0.01, 0.02], and for unreliable users P f a , P md ∼ U [0.05, 0.06]. In addition, as analyzed in the Remarks of Lemma 1, malicious users need to make sure that their false alarm and missed detection rates of report, P f and P m , are both smaller than 0.5 to sustain their deception. Therefore, we set P f = P m = 0.3 (< 0.5) for malicious users in the following experiments. We assume that all honest users always report truthfully, i.e., P f = P m = 0.
• Historical database. To calculate the unreliability of each user, the DPC needs to obtain their historical report error rates. We therefore first build the report database by letting each user judge the quality of the service independently and then report to the cloud according to the user's type. The process is repeated 80 times, and each time the probability of high service quality is set as P (Q = h) = 0.6. In addition, the quality of the service is determined through the majority rule in (3), applied to the reports from all users.
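The construction of the historical database can be sketched as below. This is a simplified illustration rather than the simulation code used for the reported results: it ignores the Flickr topology, the function names and dictionary keys are hypothetical, and the error conventions are assumptions (a judgement false alarm is taken to mean judging a high-quality service as low, a missed detection to mean judging a low-quality service as high, and a report error to mean flipping the judged value). The error ranges, P f = P m = 0.3 for malicious users, the 80 rounds, and P (Q = h) = 0.6 follow the settings above.

```python
import random

def make_users(n_reliable, n_malicious, n_unreliable, seed=0):
    """Draw judgement error rates per user type, following the settings above."""
    rng = random.Random(seed)
    users = []
    for kind, count in (("reliable", n_reliable),
                        ("malicious", n_malicious),
                        ("unreliable", n_unreliable)):
        lo, hi = (0.05, 0.06) if kind == "unreliable" else (0.01, 0.02)
        p_report = 0.3 if kind == "malicious" else 0.0  # P_f = P_m
        for _ in range(count):
            users.append({"kind": kind,
                          "p_fa": rng.uniform(lo, hi),
                          "p_md": rng.uniform(lo, hi),
                          "p_f": p_report,
                          "p_m": p_report})
    return users

def simulate_history(users, rounds=80, p_high=0.6, seed=0):
    """Return a list of (true quality, majority rating, reports) tuples."""
    rng = random.Random(seed)
    history = []
    for _ in range(rounds):
        quality = 1 if rng.random() < p_high else 0
        reports = []
        for u in users:
            # Judgement step (assumed convention for false alarm / missed detection).
            if quality == 1:
                judged = 0 if rng.random() < u["p_fa"] else 1
            else:
                judged = 1 if rng.random() < u["p_md"] else 0
            # Report step: the judged value is flipped with the report error rate.
            flip = u["p_f"] if judged == 1 else u["p_m"]
            reports.append(1 - judged if rng.random() < flip else judged)
        # Majority rule over all reports decides the recorded rating.
        rating = 1 if 2 * sum(reports) > len(reports) else 0
        history.append((quality, rating, reports))
    return history

users = make_users(40, 40, 20)
history = simulate_history(users)
print(sum(q == r for q, r, _ in history), "of", len(history), "rounds rated correctly")
```

From such a history, per-user report error rates could be estimated by comparing each user's report with the recorded rating, which is the kind of information the DPC is assumed to have.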

B. Accumulative Trustworthiness And Unreliability
In the following experiments, the private-prior peer prediction method is introduced, and the peer of each user is updated in every new experiment. New implicit opinion reports (inferred from y ij and y ′ ij ) and service rating results are then added to the database and provide historical data for the DPC. We consider that the trustworthiness and the unreliability of each user accumulate as the number of service rounds increases. To calculate the trustworthiness, both scoring rules, i.e., binary logarithmic and binary quadratic, are applied. Simulation results for the users' accumulative trustworthiness and unreliability over the following 200 service rounds are shown in Fig. 3 and Fig. 4, in which the percentages of reliable honest users, malicious users and unreliable honest users are set as 40%, 40% and 20%, respectively. In both figures, we show the results for sample users selected randomly from the three types. In Fig. 3, the trustworthiness of honest users may be negative at the beginning, when their peers are malicious. On the other hand, some malicious users even obtain larger trustworthiness at the beginning, when their peers are also malicious. However, because the peers are updated after each service round and the proportion of malicious users is small, the predominance of honest users tends to take effect in the long term. Fig. 3 indicates that the accumulative trustworthiness of honest users grows with the number of service rating rounds. In contrast, the accumulative trustworthiness of malicious users drops and becomes negative. In addition, whichever scoring rule is applied, the accumulative trustworthiness shows similar characteristics and trends.
Similar results for accumulative unreliability are shown in Fig. 4, in which the gaps among the different types of users are more obvious. Moreover, unreliable honest users can be identified through the unreliability index, which cannot be achieved through the trustworthiness alone. This result demonstrates that the best choice for unreliable honest users is still to report truthfully, and that their unreliability does not harm their high positive trustworthiness.

C. Influence of ε, Scoring Rules And User Structure
In the basic private-prior peer prediction mechanism, the strictly proper scoring rule leads malicious users to fabricate a minimum ε, i.e., y ′ ij = y ij + ε (ε > 0) when the quality of the service is low, and y ′ ij = y ij − ε when the quality is high. In the trustworthy service rating system, the proposed unreliability index creates a dilemma for malicious users when they set ε, as discussed previously. Next, we test the influence of ε on the average trustworthiness and unreliability. We consider two cases of user structure: the percentages of reliable honest users, malicious users and unreliable honest users are set as 60%, 20% and 20% in one case, and as 40%, 40% and 20% in the other. We repeat the service rating experiment 200 times, and then calculate the average trustworthiness and unreliability of each type of user over these 200 experiments (not the accumulative trustworthiness or unreliability). Fig. 5(a) and Fig. 5(b) present the average trustworthiness when applying the binary logarithmic and binary quadratic scoring rules, respectively, for ε ∈ [0.1, 0.2]. In addition, Fig. 6 presents how the average unreliability changes as ε increases. As depicted in Fig. 5 and Fig. 6, both the trustworthiness and the unreliability of malicious users decrease as ε increases, which demonstrates the incentive and identification capabilities obtained by combining trustworthiness and unreliability to evaluate users' reports. On the other hand, the average trustworthiness and unreliability of honest users are not sensitive to changes in ε. In addition, when the percentage of malicious users is small, the gaps in trustworthiness and unreliability between malicious and honest users tend to be wide, which makes it much easier to identify malicious users.
After removing unreliable reports and reports from users with low trustworthiness, we rate the service quality using the remaining trusted reports to improve the rating accuracy. In this part, we define the service rating accuracy as the ratio of the number of selected correct reports to the number of all correct reports. In addition, the unreliability threshold is set to an empirical value obtained from training on the historical database, specifically ρ thr = 5. We then test the service rating accuracy against the proportion of malicious users, the judgement error rates of unreliable honest users, and ε. The results in Fig. 7 indicate that the service rating accuracy decreases as the proportion of malicious users increases. When this proportion [...] the rating accuracy is higher when unreliable honest users' P f a , P md ∼ U [0.35, 0.45] than when P f a , P md ∼ U [0.1, 0.2], because honest users with higher judgement error rates can be identified more easily by applying the unreliability index. Fig. 7 also indicates that the smaller the ε that malicious users set, the harder they are to detect through the trustworthiness.
Fig. 7. The service rating accuracy versus the percentage of each user type, the judgement error rates P f a , P md , and ε.
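The accuracy measure defined in this part, i.e., selected correct reports divided by all correct reports, can be written as a short function. This is a sketch with hypothetical names: `reports` maps each user to an implicit opinion report, `selected` is the set of users whose reports survive the unreliability and trustworthiness filters, and a report is counted as correct when it matches the true quality. Returning 0 when no report is correct is a choice of this sketch, not something stated in the text.

```python
def rating_accuracy(reports, selected, true_quality):
    """Ratio of selected correct reports to all correct reports.

    reports: dict mapping user -> implicit opinion report (0 or 1)
    selected: iterable of users kept by the DPC after filtering
    true_quality: 1 for high quality, 0 for low quality
    """
    correct = {user for user, x in reports.items() if x == true_quality}
    if not correct:
        return 0.0  # no correct report exists, so nothing could be selected
    return len(correct & set(selected)) / len(correct)

reports = {"a": 1, "b": 1, "c": 0, "d": 1}
print(rating_accuracy(reports, selected={"a", "b", "c"}, true_quality=1))
```

In the example, three reports are correct (a, b, d) and two of them are selected, so the filter keeps two thirds of the correct reports.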

VI. CONCLUSION
In this paper, we proposed a cloud-based architecture for the service rating system. To achieve trustworthy service rating, a private-prior peer prediction based mechanism was designed to identify malicious and dishonest users. Coupled with strictly proper scoring rules, the peer prediction method can evaluate users' trustworthiness and motivate them to report honestly. Moreover, an unreliability index was designed to ensure the reliability of the users' reports. Based on the trustworthiness and the unreliability index, untruthful and unreliable reports can be identified and eliminated to improve the accuracy of service rating. Simulation results indicated that the proposed peer prediction based trustworthy service rating system can identify malicious and unreliable behaviours effectively, and achieve relatively high service rating accuracy.
Therefore, when P f a,i + P md,i < 1 and P f,j + P m,j < 1, inequality y ′ ij (h) > y ij holds. By symmetry, we have y ′ ij (l) < y ij under the same conditions. This completes the proof of Lemma 1.