
Estimating Component Availability by DempsterShafer Belief Networks Lan Guo LDCSEE, West Virginia University, Morgantown, WV 265066109 lan@csee.wvu.edu
DempsterShafer (DS) Belief Network is a complete formalism of evidential reasoning for computing and propagating evidential (confirming or disconfirming) support throughout the network. The DS evidential representation and inference scheme is a more general and robust theory than the Bayesian theory [1]. In this paper, we propose a novel methodology to induce the DS belief networks. Our method is validated by an empirical example of component availability estimation.
We use prediction logic based on a contingency table of probabilities [3] to induce the belief network. The drawback of the induction algorithm proposed by Liu et al. [2] is its dramatic dependence on the sample size. Additionally, their algorithm violates the assumption of the binomial distribution that the sample size must be constant. In [2], the sample sizes used for deriving modus ponens and modus tollens are random, only the total size of the contingency table is constant. Therefore, using binomial distribution for implication induction is improper and leads to erroneous results. We used a modified Optimality method [3] to derive the proposition between two events as shown in Figure 1. B egin Set a significance level _{min} and a minimal U_{min} For node_{p}_{, }p [0, n_{max} – 1] and node_{q}, q [p + 1, n_{max}] (Note: n_{max }is the total number of nodes) For all empirical case samples N N_{11} N_{12} N_{21} N_{22} Compute a contingency table M_{pq} = For each relation type k find the solution to Max U_{p} Subject to Max U_{p} > U_{min} _{p} _{min} _{ij} = 1 or 0 (if N_{ij }corresponds to an error cell, _{ij} = 1; otherwise, _{ij} = 0) ^{(b)} > ^{(b’)} if ^{(b)} = 1 and ^{(b’)} = 0 If the solution exists, then return a type k relation EndFigure 1. The Implication Induction Algorithm For a single error cell, if N_{ij} is the number of error occurrences, we have: U_{p} = U_{ij} = _{p} = _{ij }= 1 For multiple error cells, U_{p} = *U_{ij} (_{ij} = 1 for error cells; otherwise, _{ij} = 0) _{p} = _{ij} In our algorithm, the logical equivalent relations are derived only once and carry different weight. We use a quintuple to represent each implication rule: IЇ, I= Where W_{I} and W_{I}’ are weight functions mapping the antecedentconsequent nodes, i.e., N_{ant} and N_{con, }for the relation type R: W_{I}: N_{ant} x N_{con} [0, 1] W_{I}’: N_{con} x N_{ant}_{ } [0, 1]
In the DS belief networks, the set of all possible outcomes of a node is called the frame of discernment, , which must be exhaustive and disjoint. The DS theory allows a basic probability assignment to the subsets of a conclusion, which satisfies: m: 2^{} [0, 1], m() = 0, and. When evidence about a certain node arrives, the beliefs of this node can be updated by Dempster’s rule of combination [1]. For = {a, a}, Bel(a) = m’(a) = Bel(a) = m’(a) = Due to the node connectivity, the updated belief can be propagated throughout the network by the algorithm stated in [2].
In this experiment, the data set was generated from the Bayesian network used for estimating component availability [4] in a large distributed network. There are 1,100 model components called network access devices (NAD) in the whole network. There are six basic types of failures that occur everyday: power failures, reset button failures, address failures, bus failures, configuration failures, and other failures. Repeated failures are grouped as software error, configuration error, and operational error. Component availability is linked to the corresponding failure nodes. Based on the node probability tables associated with the Bayesian network, we generated two sets of data samples: one for constructing the DS network with 1000 data points, and the other for validating the inference scheme with 100 data points. There are only two states for each node in the network. For the failure type nodes, the two states are low failure occurrences (represented by 1) and high failure occurrences (represented by 0) with the corresponding failure number ranges. For availability node, the two states are available (represented by 1) and unavailable (represented by 0). Next, we applied the implication induction algorithm, described in Section 2, to induce the implication relationships between pairs of nodes based on the data simulated. This step builds the DS belief network automatically from the data. For the testing sample, we randomly selected an unobserved node and used its value as the new evidence and propagated the updated belief values to the other nodes reachable from the observed one. For each of the unobserved nodes, we compared the belief value predicted and the value in the testing sample, and output the difference by the evaluation metrics computed (see below). We continued step 4 and 5 until all nodes were observed. The absolute difference between the actual value in the testing sample and the computed belief value is defined as: _{X} =  Bel_{emp}(X) – Bel_{est}(X) The evaluation metrics are the mean error and the standard error of estimate defined as: = Where n_{max} is the number of nodes in the network, N_{S} is the number of simulated testing samples. In this case, n_{max} =10 and N_{S} = 100. The implication method performance of the constructed DS network was compared with the performance when no inference propagation was performed (where the frequency of event occurrences are used as beliefs) in Figure 2. Figure 2. The mean estimate error in two different modes of observation
This paper presented a novel, efficient, and dynamic means to automatically construct DS belief networks. The method is free of subjective biases, which is an important advantage over the Bayesian belief networks. In contrast to the implication induction algorithm by Liu et al. [2], which gives erroneous results, our induction algorithm is a sound prediction method. The validity of the DS belief network inducted and the implication inference algorithms is demonstrated in the experiment. The prediction error is greatly reduced by the implication method over the DS network compared to the performance without inference propagation. The predicted component availability of 67.8% is much closer to the observations from the 100 testing samples (74.0%). For the same data set, traditional probability solution offers 89.3% availability and the Bayesian network prediction is 89.4% [4]. This study is the first attempt to utilize the DS belief network for software reliability engineering. We believe it is a promising methodology. This general method could also be applied in other areas. Our future work includes employing the entropy notion for optimal inference for greater prediction accuracy over the whole network. References
