Hierarchical DSmP transformation for decision-making under uncertainty

Dempster-Shafer evidence theory is widely used for approximate reasoning under uncertainty; however, decision-making is more intuitive and easier to justify when made in a probabilistic context. Thus the transformation approximating a belief function by a probability measure is crucial for decision-making within the evidence theory framework. In this paper we present a new transformation of any general basic belief assignment (bba) into a Bayesian belief assignment (or subjective probability measure) based on a new proportional and hierarchical principle of uncertainty reduction. Some examples are provided to show the rationality and efficiency of our proposed probability transformation approach.


I. INTRODUCTION
Dempster-Shafer evidence theory (DST) [1] proposes a mathematical framework for approximate reasoning under uncertainty based on belief functions. It is thus widely used in many fields of information fusion. As any theory, DST is not exempt from drawbacks and limitations, such as its inconsistency with probability calculus, its complexity, and the lack of a clear decision-making process. Despite these weaknesses, the use of belief functions remains flexible and appealing for modeling and dealing with uncertain and imprecise information. That is why several modified models and rules of combination of belief functions were proposed to resolve some of the drawbacks of the original DST. Among the advances in belief function theories, one can underline the transferable belief model (TBM) [2] proposed by Smets, and more recently the DSmT [3] proposed by Dezert and Smarandache.
The ultimate goal of approximate reasoning under uncertainty is usually decision-making. Although decision-making can be done based on evidence expressed by a belief function [4], decision-making is better established in a probabilistic context: decisions can be evaluated by assessing their ability to provide a winning strategy in the long run in a game theory context, or by maximizing return in a utility theory framework. Thus, to take a decision, it is usually preferred to transform (approximate) a belief function into a probability measure. The quality of such a probability transformation is therefore crucial for decision-making in evidence theory. Research on probability transformation has attracted increasing attention in recent years.
The classical probability transformation in evidence theory is the pignistic probability transformation (PPT) [2] used in the TBM. The TBM has two levels: the credal level and the pignistic level. Beliefs are entertained, combined and updated at the credal level, while decision-making is done at the pignistic level. PPT maps the beliefs defined on subsets to probabilities defined on singletons. In PPT, the belief assignment of a compound focal element is equally divided among the singletons it includes. In fact, PPT is designed according to the principle of minimal commitment, which is somehow related to uncertainty maximization.
Other researchers have also proposed modified probability transformation approaches [5]-[13] that assign the belief assignments of compound focal elements to the singletons according to ratios constructed from the available information. Representative transformations include Sudano's probability transformations [8] and Cuzzolin's intersection probability transformation [13], among others. In the framework of DSmT, another probability transformation approach was proposed, called DSmP [9]. DSmP takes into account both the values of the masses and the cardinality of focal elements in the proportional redistribution process, and it can be used in both DSmT and DST. A probability transformation is usually evaluated by its probabilistic information content (PIC) [5] (PIC being the dual form of Shannon entropy), although this criterion alone is not comprehensive enough [14]. A probability transformation providing a high PIC is in fact preferred for decision-making, since it is naturally always easier to take a decision when the uncertainty is reduced.
In this paper we propose a new probability transformation, which outputs a probability with a high but not exaggerated PIC. The new approach, called HDSmP (standing for Hierarchical DSmP), is implemented hierarchically and fully utilizes the information provided by a given belief function. Succinctly, for a frame of discernment (FOD) of size n, for k = n down to k = 2, the following step is repeated: the belief assignment of a focal element of size k is proportionally redistributed to the focal elements of size k − 1. The proportions are defined by the ratios among the mass assignments of the focal elements of size k − 1. A parameter ε is introduced in the formulas to avoid division by zero and to guarantee the numerical robustness of the result. HDSmP corresponds to the last step of the hierarchical proportional redistribution method for basic belief assignment (bba) approximation presented briefly in [16] and in more detail in [17]. Some examples are given at the end of this paper to illustrate our proposed new probability transformation approach. Comparisons of our new HDSmP approach with other well-known approaches, with related analyses, are also provided.

A. Brief introduction of evidence theory
In Dempster-Shafer theory [1], the elements of the frame of discernment (FOD) Θ are mutually exclusive. Suppose that 2^Θ represents the power set of the FOD; one defines the function m : 2^Θ → [0, 1] as the basic belief assignment (bba), also called mass function, satisfying:

m(∅) = 0, Σ_{A⊆Θ} m(A) = 1 (1)

The belief function Bel(⋅) and the plausibility function Pl(⋅) are defined below, respectively:

Bel(A) = Σ_{B⊆A} m(B) (2)

Pl(A) = Σ_{B∩A≠∅} m(B) (3)

Suppose that m_1, m_2, ..., m_n are n mass functions; Dempster's rule of combination is defined in (4):

m(A) = (Σ_{A_1∩⋅⋅⋅∩A_n = A} Π_{i=1}^{n} m_i(A_i)) / (1 − Σ_{A_1∩⋅⋅⋅∩A_n = ∅} Π_{i=1}^{n} m_i(A_i)), for A ≠ ∅, and m(∅) = 0 (4)

Dempster's rule of combination is used in DST to accomplish the fusion of bodies of evidence (BOEs). However, the final goal of decision-level information fusion is decision-making, and the beliefs should be transformed into probabilities before probability-based decision-making. Although there is also some research on making decisions directly based on a belief function or bba [4], probability-based decision methods are more intuitive and have become the current trend for deciding under uncertainty from approximate reasoning theories [15]. Some existing and well-known probability transformation approaches are briefly reviewed in the next section.
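As an illustrative sketch (not the authors' implementation), the definitions above can be coded directly, representing focal elements as frozensets over the FOD; the toy bbas `m1` and `m2` below are our own example, not taken from the paper:

```python
from itertools import product

def bel(bba, a):
    """Belief of set a: total mass of the subsets of a."""
    return sum(m for f, m in bba.items() if f <= a)

def pl(bba, a):
    """Plausibility of set a: total mass of the sets intersecting a."""
    return sum(m for f, m in bba.items() if f & a)

def dempster(bba1, bba2):
    """Dempster's rule for two bbas: conjunctive combination of the
    focal elements, then normalization by the non-conflicting mass."""
    combined = {}
    conflict = 0.0
    for (f1, m1), (f2, m2) in product(bba1.items(), bba2.items()):
        inter = f1 & f2
        if inter:
            combined[inter] = combined.get(inter, 0.0) + m1 * m2
        else:
            conflict += m1 * m2  # mass committed to the empty set
    return {f: m / (1.0 - conflict) for f, m in combined.items()}

# Toy example on Θ = {a, b, c} (our own illustration)
m1 = {frozenset("a"): 0.9, frozenset("bc"): 0.1}
m2 = {frozenset("b"): 0.9, frozenset("ac"): 0.1}
m12 = dempster(m1, m2)
print(m12)
```

Note how the high conflict (0.81) between the two sources is removed by normalization, which is precisely the behavior criticized in the highly conflicting cases motivating DSmT.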

B. Probability transformations used in DST framework
A probability transformation (or briefly a "probabilization") is a mapping PT : B_Θ → P_Θ, where B_Θ denotes the set of belief functions defined on Θ and P_Θ represents the set of probability measures (in fact probability mass functions, pmf) defined on Θ. Thus a probability transformation assigns a Bayesian belief function (i.e. a probability measure) to any general (i.e. non-Bayesian) belief function. This is the reason why transformations from belief functions to probability distributions are sometimes also called Bayesian transformations.
The major probability transformation approaches used so far are the following.
a) Pignistic transformation: The classical pignistic probability was proposed by Smets [2] in his TBM framework, which is a subjective and non-probabilistic interpretation of DST. It extends evidence theory to open-world propositions and has a range of tools, including discounting and conditioning, to handle belief functions. At the credal level of the TBM, beliefs are entertained, combined and updated, while at the pignistic level, beliefs are used to make decisions by resorting to the pignistic probability transformation (PPT). The pignistic probability obtained is often called the betting commitment probability (in short, BetP). The basic idea of the pignistic transformation consists of transferring the positive belief of each compound (or nonspecific) element onto the singletons involved in that compound element, divided by the cardinality of the proposition, when working with normalized bba's.
Suppose that Θ = {θ_1, θ_2, ..., θ_n} is the FOD. The PPT for the singletons is defined as [2]:

BetP(θ_i) = Σ_{A⊆Θ, θ_i∈A} m(A)/|A| (5)

PPT is designed according to an idea similar to uncertainty maximization. In PPT, masses are assigned without discrimination among the different singletons involved. For information fusion, the aim is to reduce the degree of uncertainty and to reach a more consolidated and reliable decision result. The high uncertainty produced by PPT might not be helpful for the decision. To overcome this, some typical modified probability transformation approaches were proposed, which are summarized below.
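The equal split performed by PPT can be sketched as follows (our own minimal illustration, assuming a normalized bba stored as a dict from frozensets to masses; the numeric bba is an arbitrary example):

```python
def betp(bba):
    """BetP: split each focal element's mass equally among its singletons."""
    p = {}
    for focal, mass in bba.items():
        share = mass / len(focal)  # divide by the cardinality |A|
        for s in focal:
            p[s] = p.get(s, 0.0) + share
    return p

bba = {frozenset("a"): 0.4, frozenset("b"): 0.2, frozenset("abc"): 0.4}
p = betp(bba)
print(p)
```

Each singleton receives 0.4/3 from the full ignorance regardless of its own mass, which illustrates the indiscriminate redistribution discussed above.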
b) Sudano's probabilities: Sudano [8] proposed the Probability transformation proportional to Plausibilities (PrPl), the Probability transformation proportional to Beliefs (PrBel), the Probability transformation proportional to the normalized Plausibilities (PrNPl), the Probability transformation proportional to all Plausibilities (PraPl), and the Hybrid Probability transformation (PrHyb), respectively. As suggested by their names, different kinds of mappings are used; for a belief function defined on the FOD Θ = {θ_1, ..., θ_n}, they are respectively defined in [8].

Advances and Applications of DSmT for Information Fusion. Collected Works. Volume 4

c) Cuzzolin's intersection probability: From a geometric interpretation of Dempster's rule of combination, an intersection probability measure was proposed by Cuzzolin [12] from the proportional repartition of the total nonspecific mass (TNSM) for each contribution of the nonspecific masses involved.
CuzzP(θ_i) = m(θ_i) + (Δ(θ_i) / Σ_{j=1}^{n} Δ(θ_j)) ⋅ TNSM (11)

where Δ(θ_i) = Pl(θ_i) − m(θ_i) and TNSM = Σ_{A⊆Θ, |A|>1} m(A).

d) DSmP transformation: DSmP, proposed recently by Dezert and Smarandache, is defined as follows:

DSmP_ε(θ_i) = Σ_{X⊆Θ, θ_i∈X} m(X) ⋅ (m(θ_i) + ε) / (Σ_{θ_j∈X} m(θ_j) + ε⋅|X|) (12)

In DSmP, both the mass assignments and the cardinality of focal elements are used in the proportional redistribution process. The parameter ε is used to adjust the effect of the focal element's cardinality in the proportional redistribution, and to make DSmP defined and computable when encountering zero masses. DSmP is an improvement over Sudano's, Cuzzolin's and PPT formulas, in that DSmP mathematically makes a more judicious redistribution of the ignorance masses to the singletons involved and thus increases the PIC level of the resulting approximation. Moreover, DSmP works in both DST and DSmT.
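The DSmP redistribution in Eq. (12) can be sketched as follows for Shafer's model (a simplified illustration of ours, not the authors' code; the numeric bba is an arbitrary example):

```python
def dsmp(bba, eps=0.001):
    """DSmP: each focal element's mass goes to its singletons in proportion
    to their singleton masses, with eps guarding against all-zero masses."""
    p = {}
    for focal, mass in bba.items():
        denom = sum(bba.get(frozenset({s}), 0.0) for s in focal) + eps * len(focal)
        for s in focal:
            p[s] = p.get(s, 0.0) + mass * (bba.get(frozenset({s}), 0.0) + eps) / denom
    return p

bba = {frozenset("a"): 0.4, frozenset("b"): 0.2, frozenset("ab"): 0.4}
p = dsmp(bba)
print(p)
```

Unlike BetP, the mass m(a∪b) = 0.4 is shared roughly 2:1 between a and b, following their singleton masses rather than an equal split.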
There are still other definitions of modified PPT, such as the iterative and self-consistent approach PrScP proposed by Sudano in [5], and a modified PrScP in [11]. Although the aforementioned probability transformation approaches are different, they are all evaluated according to the degree of uncertainty. The classical evaluation criteria for a probability transformation are the following:
1) Normalized Shannon Entropy: Suppose that P(θ) is a probability mass function (pmf), where θ ∈ Θ and |Θ| = N represents the cardinality of the FOD Θ. An evaluation criterion for the pmf obtained from a probability transformation is the following [12]:

E_H = (− Σ_{θ∈Θ} P(θ) log_2 P(θ)) / log_2 N (13)

i.e., the ratio of the Shannon entropy to the maximum Shannon entropy for {P(θ) | θ ∈ Θ}, |Θ| = N. Clearly, E_H is normalized. The larger E_H is, the larger the degree of uncertainty; the smaller E_H is, the smaller the degree of uncertainty. When E_H = 0, one hypothesis has probability 1 and the rest have zero probability, so the agent or system can make a decision without error. When E_H = 1, it is impossible to make a correct decision, because P(θ) is the same for all θ ∈ Θ.
2) Probabilistic Information Content: The Probabilistic Information Content (PIC) criterion [5] is an essential measure in any threshold-driven automated decision system. The PIC value of a pmf obtained from a probability transformation indicates the level of total knowledge available to draw a correct decision:

PIC = 1 + (1/log_2 N) Σ_{θ∈Θ} P(θ) log_2 P(θ) (14)

Obviously, PIC = 1 − E_H: the PIC is the dual of the normalized Shannon entropy. A PIC value of zero indicates that the knowledge to take a correct decision does not exist (all hypotheses have equal probabilities, i.e., one has maximal entropy).
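Both criteria are straightforward to compute; the sketch below (our own illustration) evaluates E_H and PIC for a pmf stored as a dict:

```python
import math

def normalized_entropy(p):
    """E_H: Shannon entropy of the pmf p over N outcomes, divided by log2(N)."""
    n = len(p)
    h = -sum(v * math.log2(v) for v in p.values() if v > 0)
    return h / math.log2(n)

def pic(p):
    """Probabilistic Information Content: PIC = 1 - E_H."""
    return 1.0 - normalized_entropy(p)

print(pic({"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}))  # uniform pmf -> PIC = 0
print(pic({"a": 1.0, "b": 0.0, "c": 0.0, "d": 0.0}))      # certain pmf -> PIC = 1
```

The two extreme cases printed above bracket every probability transformation result: PIC close to 1 means the pmf strongly supports one hypothesis.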
Less uncertainty means that the corresponding probability transformation result better supports decision-making. Following this simple and basic idea, a probability transformation approach should attempt to enlarge the belief differences among the propositions and thus achieve a more reliable decision result.

III. THE HIERARCHICAL DSMP TRANSFORMATION
In this paper, we propose a novel probability transformation approach called hierarchical DSmP (HDSmP), which provides a new way to reduce, step by step, the mass committed to uncertainties until an approximate measure of subjective probability is obtained, i.e. a so-called Bayesian bba in [1]. It must be noticed that this procedure can be stopped at any step of the process, and thus it allows one to reduce the number of focal elements of a given bba in a consistent manner, to diminish the size of the core of the bba, and hence to reduce the complexity (if needed) when applying complex rules of combination. We present here the general principle of hierarchical and proportional reduction of uncertainties in order to finally obtain a Bayesian bba. The principle of redistribution of the uncertainty to more specific elements of the core at any given step of the process follows the proportional redistribution already used in the (non-hierarchical) DSmP transformation proposed recently in [3].
Let's first introduce two new notations for convenience and concision:
1) Any element of cardinality 1 ≤ k ≤ n of the power set 2^Θ will be denoted, by convention, by the generic notation X(k). For example, if Θ = {A, B, C}, then X(2) denotes any of the partial uncertainties A∪B, A∪C or B∪C, and X(3) denotes the total uncertainty A∪B∪C.
2) The proportional redistribution factor (ratio) of width s involving the elements X and Y of the power set is defined as (for X ≠ ∅ and Y ≠ ∅)

R_s(Y, X) ≜ (m(Y) + ε) / (Σ_{Y′⊂X, |Y′|=|X|−s} (m(Y′) + ε)) (15)

where ε is a small positive number introduced here to deal with the particular cases where all the masses in the denominator are zero. In HDSmP, we just need to use the proportional redistribution factors of width s = 1, and so we will just denote

R(Y, X) ≜ R_1(Y, X) (16)

by convention. The HDSmP transformation is obtained by a step-by-step (recursive) proportional redistribution of the mass m(X(k)) of a given uncertainty X(k) (partial or total) of cardinality 2 ≤ k ≤ n to all the least specific elements of cardinality k − 1, i.e. to all possible X(k − 1), until k = 2 is reached. The proportional redistribution is done from the masses of belief committed to the X(k − 1)'s, as done classically in the DSmP transformation. Mathematically, HDSmP is defined for any X(1) ∈ Θ, i.e. any θ_i ∈ Θ, by

HDSmP(θ_i) = m(X(1)) + Σ_{X(2)⊃X(1)} m_h(X(2)) ⋅ R(X(1), X(2)) (17)

where the "hierarchical" masses m_h(⋅) are recursively (backward) computed as follows:

m_h(X(k − 1)) = m(X(k − 1)) + Σ_{X(k)⊃X(k−1)} m_h(X(k)) ⋅ R(X(k − 1), X(k)), with m_h(X(n)) = m(X(n)) (18)

Actually, it is worth noting that X(n) is in fact unique and corresponds only to the full ignorance θ_1 ∪ θ_2 ∪ . . . ∪ θ_n. Therefore, the expression of m_h(X(n − 1)) in Eq. (18) just simplifies as

m_h(X(n − 1)) = m(X(n − 1)) + m(X(n)) ⋅ R(X(n − 1), X(n)) (19)
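The hierarchical redistribution described above can be sketched as follows (our own illustrative implementation, with focal elements as frozensets; at each level k the ratios use the original masses of the cardinality-(k−1) elements, following Eq. (18)):

```python
from itertools import combinations

def hdsmp(bba, theta, eps=0.001):
    """HDSmP sketch: for k = n down to 2, move the mass of each
    cardinality-k focal element to its cardinality-(k-1) subsets,
    proportionally to their masses (plus eps against all-zero masses)."""
    m = dict(bba)  # hierarchical masses m_h, updated level by level
    for k in range(len(theta), 1, -1):
        level = [f for f in m if len(f) == k]
        base = {f: v for f, v in m.items() if len(f) == k - 1}  # ratio basis
        for focal in level:
            mass = m.pop(focal)
            subs = [frozenset(c) for c in combinations(focal, k - 1)]
            denom = sum(base.get(s, 0.0) + eps for s in subs)
            for s in subs:
                m[s] = m.get(s, 0.0) + mass * (base.get(s, 0.0) + eps) / denom
    return {t: m.get(frozenset({t}), 0.0) for t in theta}

# vacuous bba on a 3-element FOD -> uniform probabilities
p = hdsmp({frozenset("abc"): 1.0}, ["a", "b", "c"])
print(p)
```

As a sanity check, the vacuous bba is mapped to the uniform pmf, which is the behavior discussed for Example 4 below.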

IV. EXAMPLES
In this section we show in detail how HDSmP can be applied on different simple examples. Let's examine the following examples, based on a simple 3D frame of discernment Θ = {θ_1, θ_2, θ_3} satisfying Shafer's model.

A. Example 1
Let's consider the following bba: m(θ_1) = 0.10, m(θ_2) = 0.17, m(θ_3) = 0.03, ... We apply HDSmP with ε = 0 in this example because there is no mass of belief equal to zero. It can be verified that the result obtained with a small positive ε parameter remains (as expected) numerically very close to the result obtained with ε = 0; this verification is left to the reader. The first step of HDSmP consists in redistributing m(θ_1 ∪ θ_2 ∪ θ_3) = 0.30, committed to the full ignorance, back to the elements θ_1 ∪ θ_2, θ_1 ∪ θ_3 and θ_2 ∪ θ_3 only, because these are the only elements of cardinality 2 included in θ_1 ∪ θ_2 ∪ θ_3. Applying Eq. (18) with k = 3, one gets the corresponding masses for X(2) = θ_1 ∪ θ_2, θ_1 ∪ θ_3 and θ_2 ∪ θ_3. Now we go to the next step of HDSmP: one needs to redistribute the masses of the partial ignorances X(2), corresponding to θ_1 ∪ θ_2, θ_1 ∪ θ_3 and θ_2 ∪ θ_3, back to the singleton elements X(1) corresponding to θ_1, θ_2 and θ_3. We directly use HDSmP in Eq. (17) to do this. The procedure is illustrated in Fig. 1 below.
The classical DSmP transformation [3] and the other transformations (BetP [2], PrBel and PrPl [8]) are compared with HDSmP for this example in Table I. It can be seen in Table I that the normalized entropy E_H of HDSmP is relatively small but not too small among all the probability transformations used. In fact, it is normal that the entropy drawn from HDSmP is a bit larger than the entropy drawn from DSmP, because there is a "dilution" of uncertainty in the step-by-step redistribution, whereas such a dilution of uncertainty is absent in the direct DSmP transformation.

C. Example 3
Let's consider the following bba, in which the mass assignments for all the focal elements of cardinality 2 equal zero. For HDSmP, when ε > 0, m(θ_1 ∪ θ_2 ∪ θ_3) will be divided equally and redistributed to θ_1 ∪ θ_2, θ_1 ∪ θ_3 and θ_2 ∪ θ_3, because the redistribution ratios are all equal. One sees that with the parameter ε = 0, HDSmP cannot be computed (division by zero), and that is why it is necessary to use ε > 0 in such a particular case. The results of HDSmP and the other probability transformations are listed in Table III. It can be seen in Table III that the normalized entropy E_H of HDSmP is relatively small but not the smallest among all the probability transformations used. Naturally, and as already pointed out, HDSmP_{ε=0} cannot be computed in this example because of the division by zero. But with the use of the parameter ε = 0.001, the mass of θ_1 ∪ θ_2 ∪ θ_3 becomes equally divided and redistributed to the focal elements of cardinality 2. This justifies the necessity of using the parameter ε > 0 in particular cases where some masses equal zero.

D. Example 4 (vacuous bba)
Let's consider the following particular bba, called the vacuous bba since it represents a fully ignorant source of evidence: m(θ_1 ∪ θ_2 ∪ θ_3) = 1. In this example, the mass assignments for all the focal elements of cardinality less than 3 equal zero. For HDSmP, when ε > 0, m(θ_1 ∪ θ_2 ∪ θ_3) will be divided equally and redistributed to θ_1 ∪ θ_2, θ_1 ∪ θ_3 and θ_2 ∪ θ_3. Similarly, the mass assignments of the focal elements of cardinality 2 (partial ignorances) obtained at the intermediate step will be divided equally and redistributed to the singletons included in them. This redistribution is possible thanks to the use of ε > 0 in the HDSmP formulas. HDSmP cannot be applied and computed in this example if one takes ε = 0, and that is why one needs ε > 0 here. The results of HDSmP and the other probability transformations are listed in Table IV. It can be seen in Tables I-IV that the normalized entropy E_H of HDSmP is always moderate among the other probability transformations it is compared with, and it is normal to get a larger entropy value with HDSmP than with DSmP because of the dilution of uncertainty through the HDSmP procedure. We have already shown that the entropy criterion is in fact not sufficient to evaluate the quality of a probability transformation [14], and a compromise must always be found between the entropy level and the numerical robustness of the transformation. Although the entropy should be as small as possible for decision-making, an exaggeratedly small entropy is not always preferable. Because of the way the mass of (partial) ignorances is proportionally redistributed, it is clear that if the mass assignment of a singleton equals zero in the original bba, then after applying the DSmP or HDSmP transformations this mass is unchanged and kept at zero. This behavior may appear a bit surprising at first glance, especially if some masses of partial ignorances including this singleton are not equal to zero.
This behavior is however normal in the spirit of the proportional redistribution: if one has no strong support (belief) for a singleton in the original bba, one also expects no strong support for this singleton after the transformation is applied, which makes perfect sense and helps reduce the uncertainty (i.e. increase the PIC value). Of course, if such behavior is considered too optimistic, or not acceptable because it appears too risky in some applications, it is always possible to choose another transformation instead. The final choice is always left in the hands of the user, or of the fusion system designer.
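The Example 4 computation can be checked numerically step by step; the arithmetic below is our own illustration of the two redistribution levels, taking ε = 0.001 as in Example 3:

```python
eps = 0.001
# Level k = 3: m(θ1∪θ2∪θ3) = 1 is split among the three pairs,
# whose masses are all zero, so each ratio is (0 + eps)/(3*(0 + eps)):
pair_mass = 1.0 * (0.0 + eps) / (3 * (0.0 + eps))   # each pair gets 1/3
# Level k = 2: each pair mass is split between its two singletons,
# whose masses are all zero, ratio (0 + eps)/(2*(0 + eps)):
single_share = pair_mass * (0.0 + eps) / (2 * (0.0 + eps))
# Each singleton belongs to exactly two pairs:
p_singleton = 2 * single_share
print(p_singleton)   # 1/3 for each of θ1, θ2, θ3
```

The ε terms cancel in each ratio, so the vacuous bba is mapped to the uniform pmf for any ε > 0, while ε = 0 would leave every ratio undefined (0/0).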

V. CONCLUSIONS
Probability transformation is crucial for decision-making in evidence theory. In this paper a novel, interesting and useful hierarchical probability transformation approach called HDSmP has been proposed; HDSmP always provides a moderate value of entropy, which is necessary for easier and more reliable decision-making support. Unfortunately, the PIC (or entropy) level is not the only useful criterion to evaluate the quality of a probability transformation in general. At least the numerical robustness of the method is also important and must be considered seriously, as already shown in our previous works. Therefore, to evaluate any probability transformation more efficiently and to outperform existing transformations (including DSmP and HDSmP), a more general and comprehensive evaluation criterion needs to be found. The search for such a criterion is under investigation.

Table I: Experimental results for Example 1.

Table IV: Experimental results for Example 4.