Conditioning, updating and lower probability zero
Jasper De Bock, Gert de Cooman

International Journal of Approximate Reasoning




International Journal of Approximate Reasoning 67 (2015) 1–36




Ghent University, SYSTeMS Research Group, Belgium

Article history: Received 10 December 2014; Received in revised form 18 August 2015; Accepted 24 August 2015; Available online 1 September 2015

Keywords: Conditioning; Updating; Probability zero; Regular extension; Natural extension; Sets of desirable gambles

Abstract

We discuss the issue of conditioning on events with probability zero within an imprecise-probabilistic setting, where it may happen that the conditioning event has lower probability zero, but positive upper probability. In this situation, two different conditioning rules are commonly used: regular extension and natural extension. We explain the difference between them and discuss various technical and computational aspects. Both conditioning rules are often used to update an imprecise belief model after receiving the information that some event O has occurred, simply by conditioning on O, but often little argumentation is given as to why such an approach would make sense. We help to address this problem by providing a firm foundational justification for the use of natural and regular extension as updating rules. Our results are presented in three different, closely related frameworks: sets of desirable gambles, lower previsions, and sets of probabilities.

What makes our justification especially powerful is that it avoids making some of the unnecessarily strong assumptions that are traditionally adopted. For example, we do not assume that lower and upper probabilities provide bounds on some 'true' probability mass function, on which we can then simply apply Bayes's rule. Instead, a subject's lower probability for an event O is taken to be the supremum betting rate at which he is willing to bet on O, and his upper probability is the infimum betting rate at which he is willing to take bets on O; we do not assume the existence of a fair betting rate that lies in between these bounds. © 2015 Elsevier Inc. All rights reserved.


1. Introduction

Conditioning on events with probability zero is traditionally considered to be problematic because Bayes's rule cannot be applied. In those cases, depending on the approach that is taken, the conditional probability measure is either left undefined or chosen freely, without enforcing any connection with the unconditional measure. The latter approach has the advantage of being more flexible, but it does not really solve the problem, because it provides no guidelines on how to come up with such a conditional model. In order to avoid all issues with countable additivity and measurability, which make the problem even more complicated, we restrict attention to finite state spaces, thereby allowing us to work with probability mass functions. In that case, the most common solution to this problem is simply to ignore it, because from a practical point of view, in a finite state space, events with probability zero are usually considered to be impossible anyway, which makes the task of conditioning on them rather irrelevant.
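To make the difficulty concrete, here is a minimal Python sketch, not taken from the paper, of Bayes's rule for conditioning a mass function on an event in a finite state space; the function name `condition` and the dictionary representation are our own illustrative choices. The rule breaks down exactly when the conditioning event has probability zero:

```python
def condition(p, O):
    """Condition the mass function p (dict: outcome -> probability) on
    the event O (a set of outcomes) via Bayes's rule.

    Returns the conditional mass function p(.|O), or None when p(O) = 0,
    in which case Bayes's rule cannot be applied.
    """
    p_O = sum(p[w] for w in O)          # probability of the event O
    if p_O == 0:
        return None                     # Bayes's rule is undefined here
    # renormalise the mass inside O; outcomes outside O get probability 0
    return {w: (p[w] / p_O if w in O else 0.0) for w in p}

p = {"a": 0.5, "b": 0.5, "c": 0.0}
print(condition(p, {"a", "c"}))  # {'a': 1.0, 'b': 0.0, 'c': 0.0}
print(condition(p, {"c"}))       # None: p(c) = 0
```

The second call illustrates the problem described above: once p(O) = 0, the conditional model is simply left undefined, unless one chooses it freely without any connection to p.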

The situation becomes more complex when we consider a set of probability mass functions instead of a single one, because then the lower and upper probability of an event need not coincide, and it may have lower probability zero, but

Corresponding author.

E-mail addresses: (J. De Bock), (G. de Cooman).

positive upper probability. This happens frequently in practice. For example, many common methods for deriving sets of probabilities from data will assign lower probability zero to an event for which there is no data (because it might well be impossible) but not upper probability zero (because the fact that you have not seen it yet does not imply that it is impossible). Clearly, such events cannot be ignored, and we need to be able to condition on them.
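For a finite set M of mass functions, the lower and upper probability of an event O are simply the minimum and maximum of p(O) over M. A small illustrative sketch (the function names are ours, not the paper's), showing an event with lower probability zero but positive upper probability:

```python
def event_prob(p, O):
    """Probability p(O) of the event O under the mass function p."""
    return sum(p[w] for w in O)

def lower_upper(M, O):
    """Lower and upper probability of O over a finite set M of mass functions."""
    probs = [event_prob(p, O) for p in M]
    return min(probs), max(probs)

# One element of M rules out "b" entirely; another gives it probability 0.4.
M = [{"a": 1.0, "b": 0.0}, {"a": 0.6, "b": 0.4}]
lo, up = lower_upper(M, {"b"})
print(lo, up)  # 0.0 0.4: lower probability zero, positive upper probability
```

Such an event cannot be dismissed as impossible, since some elements of M assign it positive probability, which is precisely why a conditioning rule is needed for it.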

Fortunately, working with sets of probability mass functions is already part of the solution, because it allows us to condition on an event O that has probability zero in a trivial manner, simply by taking the conditional model to be the set of all probability mass functions on O; this is called the vacuous model on O. Since Bayes's rule does not impose any constraints, this vacuous model represents all we know about the conditional probability mass function on O.
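Although the vacuous model on O is an infinite set of mass functions, its lower and upper expectations have a very simple closed form: the minimum and maximum of the gamble over O. A minimal sketch, with function names of our own choosing:

```python
def vacuous_lower_expectation(f, O):
    """Lower expectation of the gamble f (dict: outcome -> value) under the
    vacuous model on O: the minimum of f over O, attained by the degenerate
    mass function that puts all mass on a minimising outcome in O."""
    return min(f[w] for w in O)

def vacuous_upper_expectation(f, O):
    """Upper expectation of f under the vacuous model on O: the maximum of
    f over O."""
    return max(f[w] for w in O)

f = {"a": 1, "b": 2, "c": 3}
print(vacuous_lower_expectation(f, {"a", "b"}))  # 1
print(vacuous_upper_expectation(f, {"a", "b"}))  # 2
```

This makes the vacuous model computationally trivial, despite being the most conservative conditional model available.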

If we start from a set M of mass functions, then three cases can be distinguished. If every mass function in M assigns probability zero to O, we use the vacuous model on O as our conditional model. If every element of M assigns positive probability to O, the conditional set of mass functions is obtained by applying Bayes's rule to every element of M. The remaining possibility is the one that we have already mentioned: the case that O has probability zero according to some elements of M (lower probability zero) but positive probability according to others (positive upper probability). Here, imprecise-probabilistic frameworks (such as, but as we shall see not limited to, sets of probability mass functions) usually distinguish between two main conditioning rules: natural and regular extension. Natural extension again uses the vacuous model on O, consisting of all probability mass functions on O. Regular extension ignores the probability mass functions in M that assign probability zero to O and simply applies Bayes's rule to the others.

Example 1. Consider a ternary state space Ω = {a, b, c} and let f be the real-valued map on Ω that is defined by f(a) := 1, f(b) := 2 and f(c) := 3. Let M be the set of all mass functions p on Ω for which the corresponding expectation P_p(f) := ∑_{ω∈Ω} f(ω)p(ω) of f, also called the prevision of f, is higher than or equal to 2 or, equivalently, the largest set of mass functions on Ω such that the lower expectation