### Basic Probability Paradox

This is actually a continuation of my frustrations with University assessment, masquerading as a mathematical paradox presentation. In actual fact I don't believe there is any paradox. Instead, I believe the University assessors are too interested in teaching people to regurgitate their gospel to worry about issues like the truth. Let's start with the problem description.

*The question*

**Two dice are rolled simultaneously. Given that one die shows a "4", what is the probability that the total on the uppermost faces of the two dice is "7"?**

I maintain that the answer is 1 in 6, and yet in a recent quiz I answered 2 in 11, because I knew that's what the lecturer believed. Yep, I actually swallowed my stubbornness, played their game and took the marks. Not, mind you, before spending some time one on one with the lecturer discussing the issue. The only thing I got out of that discussion was that there would be no discussion on the issue. As for the rest of university, the answer is what the lecturer says it is.

Now let me describe the reasoning behind the answers. Firstly, the lecturer's answer of 2 in 11:

*Justification of the lecturer's answer*

In throwing two dice we have a sample space of 36 outcomes. If one of the dice is a 4, that sample space is restricted to these 11 outcomes (written as first die-second die): 1-4, 2-4, 3-4, 4-4, 5-4, 6-4, 4-1, 4-2, 4-3, 4-5, 4-6. If we count the number of those outcomes that have 7 as the total, we find two favourable outcomes (3-4 and 4-3), giving us the probability of 2 in 11. Simple eh? Wrong.
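For what it's worth, the counting itself is easy to reproduce. The following is a minimal Python sketch, assuming two fair dice with all 36 ordered outcomes equally likely; the names `outcomes`, `with_four` and `favourable` are made up for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes (first die, second die), assumed equally likely.
outcomes = list(product(range(1, 7), repeat=2))

# Keep only the outcomes in which at least one die shows a 4.
with_four = [o for o in outcomes if 4 in o]

# Of those, the outcomes whose total is 7.
favourable = [o for o in with_four if sum(o) == 7]

print(len(with_four), favourable)
print(Fraction(len(favourable), len(with_four)))
```

This only confirms the arithmetic (11 outcomes, 2 of them favourable); whether those 11 outcomes are the right sample space is the actual point in dispute.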

*Paradox posed by lecturer's answer*

Say I have the two dice in my hand, and I drop one of them (I don't know which one, whichever falls out first). What are the chances it is a 5? 1 in 6 of course. Okay, say I don't make a guess at the first die, but I am told it is a 4. If I then drop the second die, what are the chances it is a 3? Same as dropping any other die - 1 in 6 of course! A crucial fact in probability theory is that the probability of an event is not affected by history. Rolling a "4" on a fair die has always happened 1 in 6 times, regardless of how many "4"s (or any such things) have been rolled in the past.

Okay, so let's look at the original question again. Two dice have been rolled. If I were to make a guess that the total of the two dice is 7, I would have a 1 in 6 chance of being right. I don't think there is any controversy here - there are 6 ways out of 36 that the total can be 7, which is equivalent to 1 in 6. Say just before making my guess, I'm given reliable information that one of the dice shows a "4". According to the lecturer, now that I have more information about the situation I can be more confident that the total is 7. That is, my odds of being right about the total being 7 have increased from 1 in 6 to 2 in 11.

Suppose for a moment that I was told one of the dice shows a "1". Again, according to the lecturer, my chances that the total will be 7 have increased to 2 in 11. What if I was told that one of the dice shows a "2"? Still 2 in 11? What if I was told a "3", or a "5" or a "6"? By the same logic, in each of these scenarios, armed with the information of what appears on one of the dice, I can bet my pretty pennies with extra confidence, because the chances that a total of 7 results have increased from 1 in 6 to 2 in 11. So knowing beforehand that a "1" appears, or a "2" or a "3" or a "4" or a "5" or a "6", appears to increase my chances of guessing the total of 7, compared to not knowing that a "1" or a "2" or a "3" or a "4" or a "5" or a "6" appears. Next time I'm rolling two dice then, all I need someone to do is let me know "there's a number from 1 to 6 appearing!" and my chances of guessing the 7 total are better. Couldn't I just tell myself that? Looks like a paradox, doesn't it? Well, it's only a paradox if you believe that the chances ever increased to 2 in 11. I maintain they did not. Rather than leave this proof as the contradiction already laid out (proofs by contradiction always leave me unsatisfied) I'll explain my reasoning as to why the chances of the 7 total are always 1 in 6.

*Justification of my answer*

Rolling two dice "simultaneously", as explicitly specified in the question, makes no more difference to the dice than if they were rolled one after the other. Neither die knows the other one has been rolled. If I were to make a guess on the value of the first die to stop rolling, I would have a 1 in 6 chance of getting it right. If I were to make a guess on the value of the second die to stop rolling, I would have a 1 in 6 chance of getting it right. As I mentioned before, the probabilities don't change over time nor due to events happening nearby. Say the first die to stop rolling is the green one and the second the blue one. If after the dice were rolled, I was told that the green one shows a 4, that doesn't change the fact that I have a 1 in 6 chance of guessing the value of the blue one. The dice are independent objects and their outcomes are determined completely without regard to each other.

In fact, even if the dice are not uniquely identifiable, and all I am told is that a "4" appears, I still have a 1 in 6 chance of guessing the value of the other die. What do I care that the die I am guessing is blue, or that it was rolled first or that it stopped rolling last? All I care is that a die has been rolled and I'm guessing its value. But hold on, isn't a double less likely than any other pair? Doesn't that mean that I would be foolish to guess that the other die shows a "4" as well? Well actually, no. I would have just as much chance of being right if I picked 4 as if I picked 5 - that is, 1 chance in 6. The subtle point is that there are actually 5 outcomes which don't produce a double and 1 that does. A double 4 is only more special than a 4 and a 5 because we place special emphasis on it. The chances of a double are still 1 in 6, and a non-double 5 in 6, just as in the case of two unknown dice.

Returning to the question then, if I did not know the value of either die the chances that the total is 7 would be 1 in 6. The favourable outcomes are 1-6, 2-5, 3-4, 4-3, 5-2 and 6-1, that is, 6 favourable outcomes out of 36 possibilities. Look at the favourable outcomes - regardless of the value of the first die, there is only one value of the second die which produces a favourable outcome. Therefore, if I am informed of the value of one of the dice, regardless of which one and of which value, there is a 1 in 6 chance that the other die shows the value I want. So the probability that the total is 7, given the value of one of the dice, remains 1 in 6.

*Why the lecturer's answer is wrong*

Finally, I want to show where the lecturer's assumptions went wrong in coming up with the answer of 2 in 11. Once the sample space is written out, with the 11 combinations of two dice where a "4" appears, and the favourable outcomes identified, it is quite a straightforward deduction to arrive at the answer of 2 in 11. I propose that the error was actually made in the construction of the sample space. The sample space of the two dice is the regular pattern of 36 combinations, each appearing once. In this problem however, it is known that one die shows a "4". Let's imagine how this might occur.

Obviously, there was an observer who provided the extra piece of information. Suppose that this observer made their observation by announcing the value of the die that first stops rolling. In doing so, they have specified a distinction between the dice. This is statistically identical to following the yellow die, or the bigger one, or the one that was rolled first. In any case, the decision about which die to call has been made before, or regardless of, the actual rolling event. In that case, the announcement that a "4" appears restricts our sample space not to 11 cases, but to 6. They are: 4-1, 4-2, 4-3, 4-4, 4-5, and 4-6, because a distinction has been made between the dice. Of course, here there is one favourable outcome (4-3) and the probability is simply 1 in 6.
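The designated-die case can be enumerated directly. This is a small Python sketch under the assumption that the die being followed is written first in each pair (it could equally be "the yellow one"); `first_is_four` and `favourable` are illustrative names only.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes; the first entry is the die the observer follows.
outcomes = list(product(range(1, 7), repeat=2))

# The observer always reports that same die, and it shows a 4.
first_is_four = [o for o in outcomes if o[0] == 4]
favourable = [o for o in first_is_four if sum(o) == 7]

print(first_is_four)
print(Fraction(len(favourable), len(first_is_four)))
```

Because the choice of die is fixed before the roll, the six remaining outcomes are equally likely and only 4-3 totals 7.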

Suppose instead that no decision is made until the dice are rolled. In other words, there is no distinction between the dice and both dice are considered with equal interest, after they are rolled. Say then, our observer announces the value of the first die half the time and the second die the other half of the time. Say also that our observer lets us know that a "4" appears. What are the possible outcomes that may have occurred? Well, if a "4" and a "3" appeared, then we have a 50% chance that our observer announces the "4" (the other 50% of the time the observer would announce the "3"). The same thing applies to the 10 combinations 1-4, 4-1, 2-4, 4-2, 3-4, 4-3, 5-4, 4-5, 6-4 and 4-6. The exception is if a 4-4 was rolled. In that case, we are guaranteed that our observer will announce a "4". In other words, in the case of 4-4, we are twice as likely to have a "4" announced as if a 3-4 was rolled. When we draw the sample space given the announcement of a "4", then, we must make sure 4-4 appears twice as many times as each of the other combinations. That is: 1-4, 4-1, 2-4, 4-2, 3-4, 4-3, 4-4, 4-4, 5-4, 4-5, 6-4 and 4-6. There are 12 combinations there, and 2 that give us the favourable outcome of totalling 7. 2 in 12 is 1 in 6, as I originally proposed.
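The weighting described above can be written out exactly. The sketch below is one way to model it in Python, assuming the observer picks either die with probability 1/2 after the roll and announces whatever that die shows; the names `p_roll`, `p_pick` and `weight` are invented for this example.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))
p_roll = Fraction(1, 36)   # probability of each ordered outcome
p_pick = Fraction(1, 2)    # observer picks either die with equal chance

# Probability of "this outcome was rolled AND the observer announced 4":
# add p_roll * p_pick once for every die in the pair that shows a 4.
weight = {o: sum(p_roll * p_pick for face in o if face == 4) for o in outcomes}

p_announce_four = sum(weight.values())
p_seven_and_four = sum(w for o, w in weight.items() if sum(o) == 7)

print(weight[(4, 4)] / weight[(3, 4)])      # how much heavier 4-4 is than 3-4
print(p_seven_and_four / p_announce_four)   # P(total is 7 | "4" announced)
```

Under this particular announcing rule the double 4 carries twice the weight of each mixed pair, which is what turns 2 in 11 into 2 in 12.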

*One last thing...*

As an epilogue, there is one question left unanswered - what question has the lecturer answered? There was a lot of merit in the process, and certainly in isolation it seems to make sense, so why doesn't the problem fit? Well, I imagine the solution would fit if the following small but critical extra criterion were adhered to: in each rolling of the two dice, if no "4" appeared, the situation is thrown out, disregarded, and the two dice are rolled again. If we are only asked for the probability of a 7 total once our observer gives us the go-ahead (that is, once at least one "4" appears), we do indeed limit our sample space to that used in the calculation of the 2 in 11 probability. I imagine in this case the answer of 2 in 11 would be accurate.
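That throw-out-and-reroll procedure is simple to simulate. Here is a rough Python sketch of it, assuming fair dice and using a fixed seed purely so the run is repeatable; the function name `reroll_until_four` and the trial count are arbitrary choices for illustration.

```python
import random

def reroll_until_four(trials, seed=0):
    """Estimate P(total is 7) when rolls without any 4 are thrown out."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    kept = sevens = 0
    while kept < trials:
        a, b = rng.randint(1, 6), rng.randint(1, 6)
        if a != 4 and b != 4:
            continue            # no 4 showing: disregard and roll again
        kept += 1
        if a + b == 7:
            sevens += 1
    return sevens / kept

estimate = reroll_until_four(200_000)
print(round(estimate, 3), round(2 / 11, 3))
```

A simulation like this only tells you what this procedure produces; it says nothing about whether the procedure matches the wording of the quiz question.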

And there endeth this insanely long discussion of a seemingly simple problem. Am I wrong? I'm very interested to hear where my reasoning has missed the boat.

## Comments

I feel so bad for you. You are absolutely correct in your analysis of this very poorly worded question. The wording of the question leads us to infer that one of the dice fell within your view and was seen to be a four, and the other fell out of your view and is unknown. In this case, your answer is in fact 1/6 as you have made clear.

Your professor obviously wants you to answer "what are the odds of rolling a seven given that AT LEAST ONE of the dice shows a four." I say "obviously" only because the 1/6 answer is too easy, and the 2/11 tests you on conditional probability, which is probably what you were studying. You were so right to give them the answer they wanted. What you did is called "playing the game" and represents many schools, but by no means all. I'm so happy to have gone to a school where the professors actually knew their stuff, and in the rare instance that they made a mistake, they came clean.

In your case, the professor should have used the phrase "Given that at least one of the dice is a four" in his question, because those magic words, "at least one," are the industry standard way to make sure that everyone understands the special (and unusual) case your professor has in mind.

My pet peeve is to always describe the game fully in these questions. This is why you are still wondering what game your professor has in mind. The game is an unrealistic one: "You roll a pair of fair dice into an area where you cannot see the results. An impartial observer will examine the result and tell you if at least one of the dice shows a four. You roll, and the observer tells you yes, you have rolled at least one four. What are the odds that you have rolled a seven?"

As you can see, this game is much more contrived than the simple game you have in mind, where one of the dice fell within your view. This is because the magic number "four" has to be decided before you roll the dice, while the more reasonable game you have in mind simply reveals one of the dice and leaves the other to chance.

It also explains your paradox as to why the probabilities don't add up. They are not mutually exclusive. It is possible to have "at least one four" and "at least one five" on the same roll.
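The overlap can be counted explicitly. A short Python sketch, assuming 36 equally likely ordered rolls and using `E[k]` as an ad hoc name for the event "at least one die shows k":

```python
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))

# E[k] is the event "at least one die shows k".
E = {k: {o for o in outcomes if k in o} for k in range(1, 7)}

print(sorted(E[4] & E[5]))              # rolls in both "a 4" and "a 5"
print(sum(len(E[k]) for k in E))        # six events of 11 rolls each
print(len(set().union(*E.values())))    # but only 36 distinct rolls
```

The six events together count 66 rolls while only 36 exist, so they cover the sample space with overlaps rather than splitting it into separate pieces.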

Posted by: Anonymous | July 18, 2006 4:59 PM

"Two dice are rolled simultaneously. Given that one die shows a "4", what is the probability that the total on the uppermost faces of the two dice is "7"?"

Actually, the teacher/question *did* say "given that *one* die shows a 4." He did not say "at least one die shows a 4." That is your (the blog writer's) error - if the question had instead been "given that at least one die shows a 4," you would have been correct with 1/6th.

So, you were wrong. However, if the professor did not explain why you were wrong (like I just did), then that is definitely bad on his part - it is integral that math teachers can explain their reasoning well.

Posted by: Anonymous | September 2, 2006 10:48 AM

From the blog writer: thanks for taking the time to write. There's certainly food for thought there.

However, your interpretation is not what the lecturer meant either! Remember that the lecturer's answer was "2/11". If you suppose the lecturer meant that only one 4 appears, then the sample space is restricted to just 10 possibilities (the original 11 minus the double 4) and the calculated probability would be 2/10 = 1/5. Now we have another reasonable answer, different from both the lecturer's (2/11) and my own (1/6)!
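A quick check of that "only one 4" count, as a Python sketch assuming fair dice and reading the condition as "exactly one of the two dice shows a 4" (the variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))

# Exactly one of the two dice shows a 4, so the double 4 is excluded.
exactly_one_four = [o for o in outcomes if o.count(4) == 1]
favourable = [o for o in exactly_one_four if sum(o) == 7]

print(len(exactly_one_four))
print(Fraction(len(favourable), len(exactly_one_four)))
```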

Given that your interpretation is quite valid, and leads to a different answer, I believe an important point to make here is that there is simply far too much ambiguity in the original question - when it comes to probability you must be quite explicit.

Posted by: LightYear | September 2, 2006 1:22 PM

I agree with all of the above. Basically the question is ambiguous.

The difference between real life and college is that in real life you have to resolve the ambiguity to get the right answer; in college you have to know which answer is being sought.

I came across a vaguely similar situation with a book question. You get 5 letters: three are bills, one is not, and you lose the last one. What is the probability the lost one was a bill? I answered 50:50, given you knew absolutely zero about the lost letter. The book then proceeded to use the bill / non-bill probability distribution of the observed letters to estimate the likelihood it was a bill. You could argue forever about whether the unseen letter is governed by the same statistics as the others. My parting shot was that it was a crap question because it allowed the argument to take place.

Best of luck

David Devoy, PhD,

Principal Design Engineer

Selex Sensors and Airborne Systems

Posted by: David Devoy | November 1, 2006 12:11 AM

Apologies about comments being broken. If this appears, all is fixed again! Comment away...

Posted by: LightYear | February 10, 2007 11:14 AM

"In that case, the announcement that a "4" appears restricts our sample space not to 11 cases, but to 6. They are: 4-1, 4-2, 4-3, 4-4, 4-5, and 4-6, because a distinction has been made between the dice."

Yes, but these six cases are *NOT* equally likely. For example, there are two ways to roll a 4-1 -- you can roll the 4 and then the 1, or you can roll the 1 and then the 4.

The probability of rolling a 4 first is (1/6) and the probability of rolling a 1 next is (1/6). So the probability of rolling 4 followed by 1 is 1/36.

Along the same lines, rolling a 1 first is (1/6) and a 4 second (1/6), so 1 followed by 4 is also 1/36.

So the probability of rolling a 4 and a 1 (where order does not matter) is 2/36.

However, the probability of rolling 4/4 is only 1/36, because you have to roll a 4 (1/6) followed by another 4 (1/6). That is the only way to get two 4s so the probability is 1/36.

So the probability of rolling something with a 4 in it is:

4-1, 2/36

4-2, 2/36

4-3, 2/36

4-4, 1/36

4-5, 2/36

4-6, 2/36

---------

11/36

And the probability that you rolled a 4 and a 3, when you know you rolled at least one 4 is:

(2/36) / (11/36) --> 2/11.

All of your other arguments can be proven faulty as well.

The only thing that is ambiguous about the question is if you rolled 'at least one 4' or 'exactly one 4'. However, in the case of the latter, I think the correct answer is 2/10. So, that ambiguity does not help your case anyway.

Posted by: Anonymous | February 10, 2007 12:28 PM

"For example, there are two ways to roll a 4-1 -- you can roll the 4 and then the 1, or you can roll the 1 and then the 4."

You've misunderstood my description. The "4-1" statement explicitly specifies that the 4 comes first. Re-read my post (and the comments so far) and you'll see that all your points have been addressed. Making the same argument using a different denominator won't convince me of much ;)

Posted by: LightYear | February 10, 2007 7:11 PM

Two coins have been flipped. If I were to make a guess that they differ, I would have half a chance of being right. Say just before making my guess, I'm given reliable information that one of the coins is heads. According to the lecturer, now that I have more information about the situation I can be more confident that they differ. That is, my odds of being right about their differing has increased from 1 in 2 to 2 in 3.

Suppose for a moment that I was told one of the coins shows tails. Again, according to the lecturer, my chances that they differ have increased to 2 in 3. Just as if I was told that one of the coins shows heads. By the same logic, in both of these scenarios, armed with the information of what appears on one of the coins, I can bet my pretty pennies with extra confidence, because the chances that they differ have increased from 1 in 2 to 2 in 3. So knowing beforehand that a head appears, or a tail appears, appears to increase my chances of guessing they differ, compared to not knowing that a head or a tail appears. Next time I'm flipping two coins then, all I need someone to do is let me know "there's a head or a tail appearing!" and my chances of guessing they differ are better. Couldn't I just tell myself that? Looks like a paradox, doesn't it? Well, it's only a paradox if you believe that the chances ever increased to 2 in 3. I maintain they did not.

Posted by: fizzle | February 10, 2007 8:11 PM

It may be easier to understand this "paradox" if we reduce the number of sides of the die from six to two, and recast the problem as a common-sense one.

Imagine a family with two children. What is the probability that they are of different sex? It's clearly one half, counting boy-girl or girl-boy out of the four possible combinations. If I told you that at least one of the children was a girl, then that only eliminates the combination boy-boy and means that the probability that they are of different sex increases from one half to two-thirds. The same happens if I said at least one were a boy.

Posted by: Paul Carpenter | February 10, 2007 8:38 PM

I find conditional probability difficult to reason about. Sometimes it's easier to try. Simulating a million dice rolls repeatedly gives me a ratio of about 0.182 7's for each pair containing at least one 4. That's 2/11.

Posted by: Steve Goldman | February 11, 2007 2:08 AM

fizzle, I like the recast to a coin problem - the same argument now looks exceedingly obvious.

Paul, I actually addressed the family with two children problem here: http://heath.hrsoftworks.net/archives/000047.html You'll see that the conclusion still holds, but the reasons take a lot more shape.

Steve, your simulation is begging the question. Mikael Johansson has done something similar, and you can read his results and my response on his blog.

PS. I just enabled HTML in the comments because it's hard to be precise without it.

Posted by: LightYear | February 11, 2007 12:38 PM

Sorry, but your professor was absolutely correct. The only ambiguity is if the question means "at least one dice shows a 4" (in which case the answer is 2/11) or "exactly one dice shows a 4" (in which case the answer is 1/5).

The misunderstanding is your interpretation of how someone would answer the question "Does [at least] one dice show a 4?". The person must consider *both* dice in answering this, not just one like in your solution.

"Next time I'm rolling two dice then, all I need someone to do is let me know "there's a number from 1 to 6 appearing!" and my chances of guessing the 7 total are better."

No, only if you state a *specific* number that will appear do the chances rise to 2/11. You could make this argument, for example, if you had two rigged dice that had sensors in them and whenever you rolled them at least one would always show a 4.

Posted by: CuBr | February 11, 2007 4:54 PM

This sounds awfully similar to the infamous Monty Hall problem.

Incidentally, you say that "a distinction has been made between the dice" in saying that one of them is a 4. I believe this is not the case -- it is only specified that one of the dice is 4, not *which* one.

Posted by: jkao | February 11, 2007 5:07 PM

(addendum)

I think your error is unfamiliarity with the (admittedly bizarre) conventions of probability word problems. :p

Posted by: jkao | February 11, 2007 5:09 PM

Posted by: LightYear | February 11, 2007 5:15 PM

The statement "one die shows a 4" can be interpreted in a few ways, all of which yield different answers:

* "I see at least one 4 among the two dice" - answer: 2/11

* "I see exactly one 4 among the two dice" - answer: 1/5

* "The first of the two dice I rolled shows a 4" - answer: 1/6

Personally, I would have chosen the first one. I think the third is a somewhat nonstandard interpretation of the phrase.

fizzle, your argument is incorrect. It boils down to:

P(A|B) = P(A|C)

B union C = everything

therefore P(A|B) = P(A)

where A = coins differ

B = at least one heads

C = at least one tails

The problem is that the last line does not follow from the first two unless B and C are mutually exclusive events.
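To spell out that missing condition with the coin numbers, here is a worked sketch assuming two fair coins, so that each of HH, HT, TH and TT has probability 1/4. The shortcut relies on the law of total probability, which needs B and C to be disjoint. Here they overlap on HT and TH, so the overlap has to be subtracted instead:

$$
\begin{aligned}
P(B) &= P(C) = \tfrac34, \qquad P(B \cap C) = \tfrac12 \\
P(A \mid B) &= \frac{P(A \cap B)}{P(B)} = \frac{1/2}{3/4} = \tfrac23 = P(A \mid C) \\
P(A) &= P(A \cap B) + P(A \cap C) - P(A \cap B \cap C) = \tfrac12 + \tfrac12 - \tfrac12 = \tfrac12
\end{aligned}
$$

So both conditional probabilities can be 2/3 while the unconditional probability stays at 1/2, with no contradiction.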

Posted by: Ari Nieh | February 11, 2007 5:16 PM

By the way, you're right about the Monty Hall problem. It in turn, is similar to the "women with two children" problem, which I've addressed and compared to this problem here: http://heath.hrsoftworks.net/archives/000047.html

Posted by: LightYear | February 11, 2007 5:20 PM

Wow.. way to misunderstand a simple question and make it complicated and waste hours on it. Sorry dude, but it's not such a deep concept. The question implies "at least one."

Posted by: nutbearer | February 11, 2007 6:21 PM

I think your understanding of statistics is fine, but your understanding of the word 'given' is flawed. Here is how I think you are supposed to read the problem:

"Two dice are rolled simultaneously."

That means that there are 36 equally likely outcomes.

"Given that one die shows a '4'"

Looking in some dictionaries, we see that given means:

- Specified; fixed:

- assumed as actual

- Granted as a supposition

In other words, "Consider only the cases where at least one of the dice is a 4". This phrase does *not* tell us which die is 4, just that at least one is. So, we can only prune our list down to 11 equally likely possibilities.

"what is the probability that the total on the uppermost faces of the two dice is '7'"?

Of those 11 cases we are considering, 2 of them add up to 7, so the answer is 2/11.

Your problem is that you imagined the observer in the wrong way. In your example, the observer looks at the thrown dice, picks one of them by some method (left-most, random, etc), and tells you the number on it (which happens to be 4). In that case, you know that the observer can only be seeing one of 6 things, 4-1, 4-2, 4-3, 4-4, 4-5, 4-6. So the best guess you can make is 1/6.

However, the problem tells you 'given that one die shows 4', aka, 'it is specified that one die equals 4'.

In this case, the observer is *not* picking a die and telling you the number on it. Instead, he is supervising the rolls, and making sure you only get to make a guess when at least one of the dice comes up 4.

It is somewhat ambiguous as to whether the question means 'at least one die shows 4' or 'exactly one die shows 4'. But the answer is either 2/11 or 2/10.

To show that the answer is 1/6, you would need to show that your interpretation of the word 'given' in that context is valid.

Posted by: Anonymous | February 11, 2007 6:25 PM

Your argument using the idea that some observer chose at random which number to tell you about would work, but that's not what's being stated here. I think you're right that this makes things problematic. We should conditionalize on all the information we have here, which includes the fact that the observer told us something, and not just what the observer told us. Unfortunately, if we don't know what strategy the observer is using, then we don't know how to do this. For instance, if she was using the strategy to always announce the lower of the two dice, then the probability of 7 is 0! And if she was always announcing the larger, then it's 2/7. So given that we lack this information, we have to appeal to our prior distribution over strategies, which is never explicitly stated, so there is no one right answer to this question.
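The two deterministic strategies mentioned above can be checked by enumeration. This is a minimal Python sketch assuming fair dice and an observer who applies a fixed rule to every roll; the helper `p_seven_given_announced` is a made-up name for this example.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))

def p_seven_given_announced(announce, value=4):
    """P(total is 7 | observer says `value`) for a deterministic rule `announce`."""
    said = [o for o in outcomes if announce(o) == value]
    return Fraction(sum(1 for o in said if sum(o) == 7), len(said))

print(p_seven_given_announced(min))   # observer always announces the lower die
print(p_seven_given_announced(max))   # observer always announces the higher die
```

The same announcement of "4" leads to different probabilities purely because the observer's rule differs, which is the sense in which the question is underspecified.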

However, one argument you make is definitely flawed. You point out that things should work out the same whether the number announced was 1, 2, 3, 4, 5, or 6. This is correct. However, when you thereby infer that this must be the same as the unconditional probability you are making a mistake. This inference is only valid if the possibilities you consider are mutually exclusive as well as exhaustive - but in this case, the sets of information conditionalized on in the different cases are not exclusive. Unless doubles are rolled, two of these events happen, not just one.

In the case where the events form an exclusive, exhaustive partition of the entire space, this principle is called "conglomerability" by people working in this field, and is close to one called "reflection" that was introduced by Bas van Fraassen in his paper "Belief and the Will" in 1984, I think. These principles play an important role in my dissertation that I'm writing, on the proper formal account of conditional probability.

Posted by: Kenny Easwaran | February 11, 2007 8:02 PM

Let us say that you flip two coins and are given the information that a head appears. Would you bet a penny they differ?

Let us repeat this procedure but let the information be that a tail appears. Would you bet a penny they differ?

Now consider this procedure being repeated many times, sometimes the information given being that a head appears and other times that a tail appears. If you bet each time according to the reasoning you used in the above two case would you expect to win money?

Posted by: fizzle | February 11, 2007 8:25 PM

This is an elementary blunder.

We are *not* twice as likely to have a 4 announced on a roll of 4-4. There is only *one* way to roll 4-4, not two, so it appears only once in the sample space of rolling two dice. This gives two favorable outcomes out of eleven, the correct answer. You are trying way too hard, and your tortured distortion of the sample space doesn't make sense. Your interpretation would have the chances of rolling 1-1 with two dice be 2/36, which is absurd and equally wrong.

Posted by: Anonymous | February 11, 2007 8:37 PM

fizzle:

If I flip two coins once, don't see the result, but am told that one of them is a head, I would bet a penny that they are different. Same reasoning as with the Monty Hall problem.

If I repeat it, and this time, I'm told that one of them is a tail, I would bet a penny that they are different. Same reasoning as with the Monty Hall problem.

If we make an iterated game out of it, I wouldn't be able to make a reliable analysis without some sort of codified discussion of the observer's exact algorithm for deciding what to tell me.

Posted by: Mikael Johansson | February 12, 2007 1:00 AM

"That I would be willing to admit! Please direct me straight to the published conventions on describing probability problems with words ;-)"

For that you have to go to class. :)

Posted by: jkao | February 12, 2007 7:44 AM

do you know any programming languages? if yes, write a simple program:

1. generate two random integers from 1-6.

2. if neither one is a 4, go back to 1 and repeat. else, continue.

3. record whether the sum of the dice equals 7.

run this a few thousand times and count the proportion of times you are returned that the sum equals 7. it will be very close to 2/11, rather than 1/6.

Posted by: alex | February 14, 2007 6:27 PM

Further to alex's post, I get 18.18% - not a mathematical proof, but a very strong suggestion towards where the correct answer lies.

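    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Roll one six-sided die (the slight modulo bias of rand() is ignored). */
    int roll(void)
    {
        return (rand() % 6) + 1;
    } /* roll */

    int main(void)
    {
        long Total = 0, Count = 0;
        long i;

        srand((unsigned)time(NULL));

        /* A fixed number of rolls, so the program terminates. */
        for (i = 0; i < 10000000L; i++)
        {
            int Die1 = roll();
            int Die2 = roll();

            if (Die1 == 4 || Die2 == 4)   /* keep only rolls with at least one 4 */
            {
                if (Die1 + Die2 == 7)
                {
                    Count++;
                } /* if */
                Total++;
            } /* if */
        } /* for */

        /* Percentage of the kept rolls that total 7. */
        printf("%f%%\n", 100.0 * Count / Total);
        return 0;
    } /* main */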

Posted by: David Swart | February 15, 2007 3:46 AM

As I think most of the commenters have agreed, the dispute is about translating the English question into a precise mathematical problem, not about solving the problem once you have it. The professor is following the standard conventions for how statisticians use language. This is a reasonable thing for him to test, although I have no idea whether it is his fault or yours that you haven't learned it in the course of the term. I've found this to be a very hard thing to pick up myself, but I can try to explain this point.

"Random event R takes place. Given condition C, find the probability of condition D"

means to compute

Probability of (D and C)/Probability of (C).

This is what is often denoted P(D|C), the conditional probability of D given C.

At times, it is difficult to imagine a physical scenario which would lead to this computation. In this case, what we have to imagine is that there is a maniac sitting in the room with the dice roller, who can be counted on to shriek at the top of his lungs if any four is rolled. (And he shrieks just as loudly if two fours are rolled.) Out of your sight, a man is rolling dice over and over. Suddenly you hear a shriek, and a bookie comes up and offers to bet that the dice came up 7. What odds should you accept?

While the definition of conditional probability is peculiar, it has the virtue that (a) it is well defined without specifying any information beyond the probability space and the two events and (b) it is always, always true that P(C and D)=P(C|D) P(D).

Indeed, conditioning on highly nonphysical things is, in my limited experience, the key to proving all sorts of peculiar theorems. It is also, in my opinion, why probability papers are so darn hard to read :).

Regarding the coin flip paradox raised by fizzle, if the person deciding what to tell me is the coin flipper then I would not bet in this game. He is presumably deciding what information to reveal according to his own best interests. (This same ambiguity plagues the Monty Hall problem.) On the other hand, if the coin flipper promises ahead of time that he will tell me whether or not a head appears, and he then flips the coins and tells me that a head has occurred, I would certainly bet that the coins are different.

Hope this helps.

Posted by: DavidSpeyer | February 15, 2007 6:19 AM

Maybe it's me, but I am not fully comfortable with the preceding analyses.

"Given one of the dice shows a 4" is surely equivalent to "At least one dice shows a 4", at least that is what I would take it to mean straightaway.

Also, information from throwing the dice in succession may not be quite the same as information from throwing them simultaneously. Given the information "The first die shows a 4", then the probability that the second is 3 is of course 1/6. This is equivalent to labelling the dice and giving the information "Die 1 shows a 4" in the case of simultaneous throws. But that is not the same as "At least one is a 4".

Change the problem to an urn model. Two urns each have a set of chips numbered 1 thru 6. Someone makes a blind draw from urn A and urn B. There is a difference between "The chip from urn A shows a 4" (Heath Raftery's case) and "At least one of the chips shows a 4" (the lecturer's case) or equivalently "One of the chips shows a 4".

So you made the right choice of your answer, maybe for the wrong reasons.

Don't feel bad though, there is a relationship between this apparent paradox and the famous one known as the Monty Hall problem, which I urge you to investigate. Only by this means (enlightening ourselves over thorny logical points) can we come to fully understand probability, which is a logic not "hard wired" in our brains.

Posted by: Toby Joyce | February 18, 2007 12:31 AM

David, great response, appreciated. I think this is the crux of the issue: I interpreted the question as a description of a real event. The lecturer simply wanted me to think within the walls of the lectures. This is a classic pedagogical mistake in my opinion, but common nonetheless.

Quite simply, if one treats the word "given" in the problem as the equivalent of the pipe symbol in P(A|B), then it's simple - 2/11. If one considers the reality of the situation described, then there is a leap of faith to assume that non-complying scenarios are being disregarded.

Thanks to all those that have responded. Clearly some of you have not read the entire thread, which is understandable since it is so long, but regrettable because there's a lot of repeated material here. I won't address each case individually. Thank you especially for the well reasoned and well thought out comments. I very much enjoy the argument you're offering.

Posted by: LightYear | February 18, 2007 12:16 PM

I tried a million iterations of the following program with Matlab and got

>> probparadox

ans =

2.0103 1.0965

>> probparadox

ans =

1.9924 1.0867

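The first number is the estimated probability multiplied by 11, while the second is the same estimate multiplied by 6. The first is very close to 2, but the second is well off 1.

Without a doubt, 2/11 wins. Here is the program: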

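    % Estimate P(total is 7 | at least one die shows a 4) by simulation.
    n4 = 0;   % rolls with at least one 4
    n7 = 0;   % of those, rolls that total 7
    for i = 1:1000000
        x = ceil(6*rand(1));   % first die
        y = ceil(6*rand(1));   % second die
        if (x==4 && y~=4)      % only the first die is a 4
            n4 = n4 + 1;
            if x+y == 7
                n7 = n7 + 1;
            end
        end
        if (y==4 && x~=4)      % only the second die is a 4
            n4 = n4 + 1;
            if x+y == 7
                n7 = n7 + 1;
            end
        end
        if (x==4 && y==4)      % double 4: counted once, can never total 7
            n4 = n4 + 1;
        end
    end
    % Scale the estimate by 11 and by 6 to compare against 2/11 and 1/6.
    [(n7/n4)*11 (n7/n4)*6]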

Posted by: Steph | February 19, 2007 10:47 AM

Yep, the only ambiguity is in whether the meaning is "only one die shows a 4" (in which case there are 10 possible outcomes, of which 2 total 7, so the probability is 2/10), or "at least one of the dice shows a 4", in which case there are 11 possible outcomes (since 4-4 is now allowed) and the probability is 2/11. Getting angry with the prof is not helping you; I think the emotional frustration has been blocking you from getting the picture.

Now, if the question had been: "I roll two dice. I choose one of the dice at random and look at its value. I find that it's a four. What is the probability that the total of the two dice is 7?", then the answer would indeed be 1/6, as we're asking what the probability is of the other die showing a 3.
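The three readings can be compared side by side by brute-force counting. This is only a sketch in Python with made-up names (`conditional`, `looked`, `saw_four`); the third reading assumes the die to look at is picked with a fair 50/50 choice, so each (outcome, looked-at die) pair is treated as equally likely.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))

def is_seven(o):
    return sum(o) == 7

def conditional(event, given):
    """P(event | given) by counting equally likely outcomes."""
    kept = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in kept if event(o)), len(kept))

# Reading 1: exactly one die shows a 4 (10 outcomes survive).
exactly_one = conditional(is_seven, lambda o: o.count(4) == 1)

# Reading 2: at least one die shows a 4 (11 outcomes survive).
at_least_one = conditional(is_seven, lambda o: 4 in o)

# Reading 3: pick a die at random, look at it, and find a 4.
# Assumption: 36 outcomes x 2 choices of die = 72 equally likely cases.
looked = [(o, i) for o in outcomes for i in (0, 1)]
saw_four = [(o, i) for (o, i) in looked if o[i] == 4]
random_look = Fraction(sum(1 for (o, _) in saw_four if is_seven(o)),
                       len(saw_four))

print(exactly_one, at_least_one, random_look)   # 1/5 2/11 1/6
```

In the third reading the double 4 is counted twice (once for each die that might be looked at), which is where the 12 in 2/12 comes from.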

Posted by: Stephen Wells | March 1, 2007 9:35 AM

I'm sorry to be annoying, but I think you are still missing the point - with the "at least one" there is no ambiguity in the original question! Also I feel that in your post you decide your professor was wrong too quickly. Even now, you are still discrediting your professor by saying that there is a "leap of faith" in what is meant by the notion of "given". Actually, you do not have to assume some abstract definition of "given" - the mathematical definition corresponds with the natural everyday notion. In fact it is your solution which is quite contrived, as the oracle giving you the information does so in an extremely unusual way.

If you are given that something happens in a problem... you take as a _given_ that it happens! You say that the problem doesn't explicitly state "non-complying scenarios are being disregarded". Of course they are, what else would you mean by given? Are you really saying that if the problem was

"Given that a single dice roll was a 6, what is the probability it was a 1?"

you would answer 1/6 because you can't disregard non-complying scenarios?

You are not describing the solution I posted. Re-read it, and if you still believe I said what you think I said, quote me.

You say "Suppose that this observer made their observation by announcing the value of the die that first stops rolling.", which is immediately wrong because the observer must consider both dice.

A specific number? Like a "4"? Or a "5"?

Yes, a specific number; 1, 2, 3, 4, 5 or 6. But you say that since one of these numbers always appears you could conclude your "chances of guessing the 7 total are better". As Kenny pointed out, the possibilities are not mutually exclusive.

But there was nothing about special 4-producing dice in the question, so I didn't assume that was the case. What makes you think it is?

Rolling 4-producing dice is only one way to think about it; another way is to imagine that two regular dice are rolled and an oracle gives you the specific information and nothing else.

Posted by: CuBr | March 3, 2007 5:41 PM

This is similar to a problem I'm having which brought me to this page. Maybe someone can clear up my problem. I'll take a shot at the original.

Original Problem:

Okay, so you have 2 dice. You roll them both and close your eyes. Your buddy looks at the dice and shouts out one of the numbers: "Four!", he yells, melodramatically. Now you have to say what the probability is that the two dice together total 7.

You know that the number of outcomes that include a four (counting that double once, since it's only one outcome, the sneaky devil!) is 11. You know that of those outcomes, only 2 of them add up to 7.

Quick note: your friend is actually Charles Manson and he's told you that if you don't guess the right probability, he's going to cut off your left foot. While working out the math in your head, you realize how lucky you are that he said he only wanted the probability of the total equalling 7, and didn't say you had to correctly guess what the number was on the second die.

Okay, so you know that the number of outcomes that include a 4 is 11, and you know only two of those outcomes equal 7, so you correctly shout "Please don't hurt me! The probability is 2/11!" and he cackles and runs off to find another victim.

Now, your frustration seems to be rooted in the semantics of the question. You are reading the problem as "Charles Manson wants to cut off your foot, and agrees not to if you play a weird math game with him. He makes you roll two dice with your eyes closed and then shouts 'One of them is a four!' and then demands that you give the probability that the other die had, while you were rolling them, of coming up as the three needed for the dice to total 7." In this scenario, you're totally right: the odds of either die being part of the total of 7 is 1/6. The outcome of either die does not change that probability. Knowing the outcome of one die should not change your expectations of what the second die will be, but does change your expectations of what the total sum will be. If you roll two dice, and the first one is a 3, you can't say "well I still have a 1/6 chance for rolling snake eyes!"

So your reading of the question was "what is the probability that the second die will be a 3 thus making the total 7" versus the professor's intention, which was "what is the likelihood that the total sum will be 7 if you know that one of them is a 4". If I was taking that test, I would have read it the way the professor intended, simply because if I'm given extra information, I tend to assume that it's meant to be used toward the problem, at least when faced with a possible ambiguity.

Part of the problem (I think) is that the sum is supposed to equal 7, and any throw on one die leaves the chance to get a 7 with the other die. Let's say you were playing a modified version of craps where you bet up front what number (total) you are going to throw, you throw the dice under the table. The dealer looks under the table and yells out the outcome of one of the dice. You are allowed to either fold, stand, or double your bet for the second throw. You bet you will throw a total of 5 because 5 is your lucky number. You know that before you throw the first die that your probability of throwing a total of 5 is 4/36 or 1/9. You throw the first die and you get a 5. What is the probability of getting a 5 now? Or, you roll a 3. Are your odds still 1/9, leading you to stand? Or are they 2/11, leading you to double?

I have trouble getting this stuff straight all of the time, my instinct is like yours because it was hammered into my head in grade school that the amazing thing about probability is that each event is exclusive and the probability is the same for each event. Thus it really is possible to throw a coin 100 times and have it be heads each time.

Or, to look at it another way (man, I'm going to be killed for making up math):

If you are throwing the dice in your own scenario, where you say "no matter what the first die is, the odds of the second one are definitely 2/11!" the problem with that situation isn't that your odds are better for getting a 7 no matter what, it's that the probability for the second die DOES change based on the first one, but it doesn't matter since in the case of 7 the odds change for all possible situations. You don't stand a better chance, you just know that the second die has a 2/11 probability based on the first one. If both dice are thrown at the same time, each one has a 1/6 (or 2/12) probability because they are exclusive from each other, but when you look at one before the other, the probability changes from 2/12 to 2/11 because the first die takes away the possibility of the total not including that number.

That last part was totally out of my butt, I admit. Can anyone infer what I'm getting at and set me straight?

Okay, on to my problem, which is (hopefully) much simpler:

When playing backgammon, there are times when you get bumped off the board and there is only one free space for getting back on the board. You have two dice and have to roll the number of that space to get free and back in the game. At least one of the two dice has to be that number, it can't be the total of the dice (so if the number to get free is 6, you're stuck and very pissy if you roll two 3s.) I was trying to figure out the probability of rolling that number in this situation. My instinct was 1/6, but then I started writing out the possibilities to be sure. If there are 11 different combinations with a 6 (and 6 is the number you need to roll), that means that I have a 11/36 chance of getting the number I need. This is all well and good, but that would mean that the probability of rolling and getting a number I don't want (let's say a 2) would be 11/36 and thus, the odds of getting any number would be (11/36)*6, which equals 66/36 (and I would have a really good shot at getting the number, which also seems wrong). That doesn't seem right at all. If I change it up a bit and instead figure out the number of possible 6's total (not counting the double) over the total amount of individual dice possible (11/72) then my odds are closer to fair, but I end up with 66/72 when I multiply the odds out for each number. The only way I can work out the missing 6 is to say those are the doubles thrown out, but if I account for the doubles for any number I'm not trying to get, I still end up with 11+(12*5) over 72 (71/72). So really, my question is: how do I figure out the odds for throwing the number I need? Do I throw out the doubles? Do I throw out all the doubles or just the one for the number I need?

Posted by: Tony | April 3, 2007 1:18 AM

CuBr:

Re-read the original question. We were told that two dice were rolled. Boom - 36 possible outcomes. Then we are "given" that a 4 appears. I interpret that as being "given" more information, not that the scenario is a given. Subtle, isn't it? As I've said, I could have been "given" the information that a 1 appears, or a 2, 3, 5, or 6 appears. That doesn't change the fact that two dice have been rolled simultaneously. Read the way I contrast this situation with the women with two children problem for more clarification.

Read the next paragraph - this was only a possible scenario. I next consider the other scenario.

Correct, but that doesn't resolve the paradox. They may not be mutually exclusive but they are exhaustive. One of those statements could always accurately be made, so how can simply stating it increase my odds?

I am imagining that. You call it an oracle and I'll call it an observer. Doesn't change the fact that 2 fair dice were rolled.

Tony:

Yep, knowing one die shows a 4 means I'm sure the total will be 5, 6, 7, 8, 9 or 10. I maintain that each total is equally likely. Others do not. We all agree that the total can't be 2!

Ah... I think we're all aware that if one die is a 4, the other must be a 3 for the total to be 7.

I believe those questions are the same.

The new odds are 1/6 - I have to get a 2 on the next die. Chances of rolling a 2 are 1/6. Easy.

Ah yes, knowing the value of the first die does indeed guarantee that the total includes that value. But it does not change the outcome of the second die. It still has a 1/6 chance of producing any particular value in [1, 6]. We know the total is now in [first die+1, first die+6], but that doesn't change the likelihood of each total.

There are three simple ways to find out the odds of rolling the number you need with two dice. Your first attempt after the instinctive guess uses a correct and easy method - just count up all the scenarios with a '6' in them from the 36 outcome sample space - 11/36. Correct. The two other ways combine the probabilities for each die to arrive at a favourable outcome. I will denote the probability of a favourable outcome with P(F), unfavourable as P(F)', and the probabilities of the first and second die as P1() and P2(). Statistically AND is denoted with x and OR with +.

P(F) = ( P1(F) x P2(F)' ) + ( P1(F)' x P2(F) ) + ( P1(F) x P2(F) )

In other words, you are successful if the first die is successful only, the second die is successful only, or both dice are successful. Let's stick the numbers in:

P(F) = ( 1/6 x 5/6 ) + ( 5/6 x 1/6 ) + ( 1/6 x 1/6 )

= 5/36 + 5/36 + 1/36

= 11/36.

Same answer. Finally, this can be simplified by considering that a favourable outcome is the complement of an unfavourable outcome. That is:

P(F)' = ( P1(F)' x P2(F)' )

= 5/6 x 5/6

= 25/36.

To find the complement (the favourable outcomes), we subtract from 1:

P(F) = ( 1 - P(F)' ) = 1 - 25/36 = 11/36.

Right, so that's sorted. Now you wisely tried to double check your answer by considering other scenarios. You are right in assuming the favourable outcomes for any number (not just a '6') are still 11/36. You were wrong in implicitly assuming each case was mutually exclusive. The odds of getting a '6' overlap with the odds of getting a '5'. In other words, the outcomes are not mutually exclusive. To add the probabilities, we would have to remove the overlapping outcomes:

P(5 or 6) = P(5) + P(6 without a 5)

= 11/36 + (11/36 - 2/36)

= 20/36

And so on for the other numbers:

P(any number) = P(6) + P(5 without a 6) + P(4 without a 5 or a 6) + ...

P = 11/36 + (11/36 - 2/36) + (11/36 - 4/36) + (11/36 - 6/36) + (11/36 - 8/36) + (11/36 - 10/36)

= 11/36 + 9/36 + 7/36 + 5/36 + 3/36 + 1/36

= 36/36.
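These figures are easy to confirm by counting. The following is a rough sketch in Python; `p_shows_any` is a hypothetical helper name, and the only assumption is two fair, independent dice.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))

def p_shows_any(targets):
    """Probability that at least one of the two dice shows a number in targets."""
    hits = sum(1 for o in outcomes if set(o) & set(targets))
    return Fraction(hits, len(outcomes))

p_six = p_shows_any({6})                        # 11/36
p_complement = 1 - Fraction(5, 6) ** 2          # same thing via 1 - 25/36
p_five_or_six = p_shows_any({5, 6})             # 20/36, printed as 5/9
p_any = p_shows_any({1, 2, 3, 4, 5, 6})         # 36/36, printed as 1

# Naively adding six overlapping events double counts every non-double roll.
naive_sum = 6 * p_six                           # 66/36, printed as 11/6

print(p_six, p_complement, p_five_or_six, p_any, naive_sum)
```

The gap between `naive_sum` and `p_any` is 30/36, which is exactly the 30 non-double rolls each being counted under two different numbers.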

Posted by: LightYear | April 8, 2007 2:20 PM

you people have way to much free time

Posted by: Anonymous | April 10, 2007 8:58 AM

Anonymous, I appreciate you writing in and voicing your insightful thoughts. I'm impressed you took the time to read and comment on something you're clearly not interested in. Perhaps you'd also like to comment on an English language blog somewhere, since you're clearly uninformed in that area as well.

Double check your watch - you'll find that we both have the same amount of time available to us. I choose to host a discussion on a topic I find interesting. You choose to grace strangers with a pathetically formed insult.

Move along. You'll find plenty of other people who believe showing interest in intellectual pursuits is a sign of too much free time. You can help each other achieve nothing together, lest one of you be seen to be pursuing an interest.

Posted by: LightYear | April 11, 2007 10:04 PM

My initial take was that the lecturer was wrong to count reflections (if that's what you would call them -- I mean, e.g., both 4-3 and 3-4), because they are really two statements of the same event, i.e., "one die shows a 4." You don't care which die it is; therefore five of the events in the lecturer's sample space are redundant and the chance of a total seven is 1/6.

But you brought a great insight: if the lecturer wants to count reflections, he has to count 4-4 twice. So the sample space is 12 and the chance is 2/12 or 1/6. Bravo.

(I didn't read through all the comments, so I apologize if I'm repeating something that's already been said.)

Posted by: mgarelick | April 12, 2007 3:18 AM

Interesting perspective mgarelick, thanks for sharing. You've definitely added something new to ponder.

I just re-read Tony's comment and thought of an important clarification: I am not disregarding the new information offered to me about the 4 appearing. It is actually a coincidence that my answer doesn't change! Here's why:

Chance of getting a 7 total from two dice: 6/36 = 1/6.

Chance of getting a 3 when I've already rolled a 4 (and therefore adding to 7): 1/6.

Don't let the fact that they are the same number lead you to believe that I'm disregarding the new information. As Tony cleverly pointed out - if I was rolling for a total of 3, and I roll a 4 first, my chances of getting a 3 total have plummeted! But, if I'm rolling for an 8 total, my initial likelihood is 5/36 - if a 1 comes up I'm buggered, but if a 2, 3, 4, 5 or 6 comes up, I would then have a 1/6 chance of getting my total.

Posted by: LightYear | April 12, 2007 11:50 AM

In our Stats lesson today we discussed this "Probability Paradox" and how absolutely absurd it is to have these different logical paths leading to two different, but completely justifiable answers, one of which is conventionally acceptable, and one of which is acceptable to normal human reasoning.

We discussed that an understanding of the complexities of probability does not come naturally to the human mind. More often than not the "technically correct" answer is not normally logically justifiable.

The amount of time we choose to spend ruminating on a particular problem depends on our motive for studying the subject. If your aim is just to give the technically correct answer, you will not go beyond the 2/11ths answer that you have been taught. But the more you think about this problem and its solution (or lack of it), the more disillusioned you get with the 2/11ths answer that is conventionally acceptable. The mind then veers towards this solution of 1/6th which initially seemed much too obvious to be the correct answer.

This is the problem with such questions, you begin to question the solutions you see and veer on to the question of WHAT exactly is an ACCEPTABLE solution in the first place...

Posted by: Rohan S | November 24, 2007 10:10 AM

LOL AIDAN WE ALL KNOW THIS IS UR BLOG HAHAHAHAHAHAHAHA

Posted by: MOTOROTO | November 26, 2007 4:30 AM

Thanks for writing Rohan, that's an interesting take on things. It just goes to show how sloppy seemingly precise language is when it is subjected to mathematical rigour.

Who's Aidan? This is my blog!

Posted by: Heath Raftery | November 26, 2007 7:53 AM

What a lot of useless talk! Talk until the Big Crunch, but the arbiter is the EXPERIMENT. And I am a theorist! - still I worship at the altar of experiment! I care not a whit for all this talk on who is right. Abandon all useless theory, ye who enter here!

Pick up a pair of dice, sit down with a piece of paper and some ink, and start throwing and inking. Throw away all events that are not in the domain (sample space) i.e. no 4 showing. If there is a 4, log the throw as a countable event. Is the total 7? Yes, log it as a winner, else a loser. After some finite time, your winner/event count will converge on a number. What number is it?

Now is the time for explaining the number!

Since you seem to have identified the real problem, whether the chance of 4-4 is 1/36 or 2/36, that is an even easier experiment to do, with a lot more spread between the hypothesized outcomes than between 1/6 or 2/11. Very quickly you see that the chance of 4-4 is 1/36, which is why doubles are so highly valued in dice games.

All this gab reminds me of the time when everyone knew big balls fell faster than small balls. No one would have sullied their hands actually doing it! Really, you people would have saved a lot of time actually doing it instead of debating it.

I am an educator, so this is not a criticism of all of you per se, it is a criticism of our education system that we do not teach problem solving.

There is a very clear theoretical method for solving this problem. Since the dice are independent, and the chance of each number for a given die the same, write a matrix - 1-6 across the top and 1-6 down the left. We all agree that ALL possibilities are covered correctly.

Now fill in the combinations, giving the ALL 36 possible throws on the two independent dice. Note well that there are only 6 doubles, along the diagonal. 4-4 does not appear twice, as do 4-3 and 3-4. Striking all rows and columns without a 4 leaves 11 numbers, of which exactly 2 add to 7. 2/11 is what the theory says, and 2/11 is what the experiment verifies.
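The matrix is quick to draw by machine as well. This is a small sketch in Python; the cell markers and the `mark` helper are my own invention for display purposes. Rows are the first die and columns the second.

```python
faces = range(1, 7)

def mark(a, b):
    """Label one cell of the 6x6 grid of throws."""
    if 4 not in (a, b):
        return " . "                      # struck out: no 4 showing
    return "[7]" if a + b == 7 else " 4 "

grid = [[mark(a, b) for b in faces] for a in faces]

# Header row, then one row per value of the first die.
print("     " + " ".join(f" {b} " for b in faces))
for a, row in zip(faces, grid):
    print(f"  {a}  " + " ".join(row))

kept = sum(cell != " . " for row in grid for cell in row)
sevens = sum(cell == "[7]" for row in grid for cell in row)
print(f"{sevens} of {kept} remaining cells total 7")   # 2 of 11
```

The 4-4 cell sits where the surviving row and column cross, so it shows up once, not twice.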

It is a mistake to think 4-4' is distinguishable from 4'-4, because the numbers are indistinguishable, ie 4' from 4. We say in Quantum Mechanics that the dice are indistinguishable. What we mean is that the throws are independent, and the results of one dice indistinguishable from the results of the other (the numbers are independent). With dice, there are four independence games that can be played - O (order), D (distinguishable), OD (both), and M, the Matrix game we just did, which has neither O nor D. It turns out, M covers them all, that is, the theory applies to all of them.

Yes, I know, we can make the dice distinguishable, one could be Big RED and the other small blue - but this is precisely game "D". (R4,b3) is distinguishable from (b4,R3), but (R4,b4) is indistinguishable from (b4,R4). By indistinguishable using distinguishable dice, we mean: The ORDER DOES NOT MATTER. RED first is the same as RED second. 4+3=3+4=7.

Even if we try making 3/4 the winner, the analysis does not change.

If you wish to abandon the distinguishable dice, and go to order matters, fine, this is game "O". (3,4) is different from (4,3), but (4,4) can only occur 1/36 throws.

So clearly, "O" and "D" are the same games as "M", which I shall simply call "M".

Now if you want to have your cake and eat it too, order matters and distinguishable dice, game "OD", fine. We can do that experiment too.

Now (R4,b3) is different from (b3,R4) is different from (R3,b4) is different from (b4,R3), and of course (R4,b4) is different from (b4,R4). But that's it. You got no free lunch. There are now 72 possibilities and 22 of them have a 4, and 4 are winners, and we are back to 4/22, or 2/11.

What you CANNOT do is choose to COUNT the "OD" doubles (R4,b4) and (b4,R4) (odds 2/72) with the "M" winners (3+4) or (4+3) (odds 2/36). You can match odds, but not counts from different games.

So the odds in "M" and "OD" are the same. How do I know? The real answer is the experiment, but all of this arises from dice that are INDEPENDENT. Therefore one die does not know what the other says. Perhaps instead of indistinguishability, we should say independence. The idea is subtle, but from the matrix it jumps out at you.

After all this yakking, no one came even close!

The answer to 6+6? Doesn't this depend on the base? I mean 10 is a perfectly good answer!

Q: How many possible answers are there, given all possible bases?

Posted by: bb | February 7, 2008 10:06 PM

Hi all, I'm new at this "paradox", googled for it, and found this discussion. I read almost the whole thing, and lost almost an entire night's sleep over it. Still, I'm trying to contribute to the dispute.

I drew the matrix of all possible outcomes, as "bb" suggested. As expected, there is a diagonal of cells that add up to 7. It consists of 6 cells. Also, there is a "cross" of cells that contain AT LEAST one 4. This cross consists of 11 cells. The same is true for all the crosses of cells containing at least one 1, 2, 3, 5, or 6. All of these 6 crosses of 11 cells have 2 junctions with the diagonal of 6 cells with sum 7. So it is clearly visible that 2 out of 11 cells for each cross are winners, if you bet on outcome 7.

Therefore, if we restrict the information given by the observer to one particular cross, the odds are clearly 2/11. If the player asks the observer: "is there at least one 4 (or other number)?" he is in fact asking whether or not the unknown cell is on one particular cross. The answer can be yes or no. The same is true if the observer is in another way restricted to give information only about one of the six "at least one N" crosses (i.e. to reveal the presence of one specific number on one or two dice).

However each cell is on *two* different "at least one N"-type crosses. For example, the cell (2,4) is on the "at least one 4" cross, but also on the "at least one 2" cross. This has implications, depending on the strategy of the observer. Let me explain:

If the observer is free to choose which die he reveals, the information is worthless, as I will explain below. And this is how I interpreted the original question. I assumed that he always told the player: "there is at least one N", after randomly picking one of the dice. In other words, the observer would randomly pick one of the two "at least one N" crosses that the cell is on. That means that each cell on the "at least one 4" cross has a 50% chance to be revealed as part of another cross. Each cell? Wait a minute! What about the (4,4) cell? That one is on *just one* cross. It can only invoke the hint: "there is at least one 4", whereas the other cells have only a 50% chance to invoke the same information. The same is true for the (2,2) cell, in case the hint can only be: "there is at least one 2" etc. These cells make up the other diagonal of the matrix. Each cross has exactly 1 junction with this diagonal.

Given the assumption that the random strategy is followed, the original poster was right to count the double numbers twice, or at least to attribute to the hint "there is at least one N" a twice higher probability of coming from a double number than from another pair. In that case, 1/6 of all the hints "there is at least one 4" will come from the pair of two 4's.
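The effect of the observer's strategy can be worked out exactly. Below is a minimal sketch in Python under the stated assumption that the observer picks one of the two dice with probability 1/2 and announces its value; `p_hint_and` and the other names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))
p_cell = Fraction(1, 36)     # each cell of the matrix is equally likely
half = Fraction(1, 2)        # assumed: observer picks either die with prob 1/2

def p_hint_and(event, n=4):
    """P(observer announces n AND event holds) for the random-die observer."""
    total = Fraction(0)
    for o in outcomes:
        if event(o):
            # o.count(n) is how many of the two possible picks would show n.
            total += p_cell * half * o.count(n)
    return total

p_hint = p_hint_and(lambda o: True)
p_seven_given_hint = p_hint_and(lambda o: sum(o) == 7) / p_hint
p_double_given_hint = p_hint_and(lambda o: o == (4, 4)) / p_hint

# For contrast: an observer who only ever answers "is there at least one 4?"
cross = [o for o in outcomes if 4 in o]
p_double_given_cross = Fraction(sum(1 for o in cross if o == (4, 4)), len(cross))

print(p_hint, p_seven_given_hint, p_double_given_hint, p_double_given_cross)
# 1/6 1/6 1/6 1/11
```

Under the random-die strategy the double 4 accounts for 1/6 of the "4" hints instead of 1/11 of the cross, and the chance of a 7 falls back to 1/6.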

Randomly picking a die to reveal spoils the information.

In my opinion, such a strategy is not excluded by the words: "given that there is at least one 4", but it IS excluded if the player (by asking), or some rule, can restrict the information given by the observer, to one specific number. But I'm open to corrections about that view. And to the rest of my typings of course.

Tougher than the Monty Hall problem, this one. More seductive too, maybe because 2/11 and 1/6 are so close. But fun anyway.

Posted by: Justin Case | July 14, 2008 2:41 AM

Hi all, I'm new at this "paradox", googled for it, and found this discussion. I read almost the whole thing, and lost almost an entire night's sleep over it. Still, I'm trying to contribute to the dispute.

I drew the matrix of all possible outcomes, as "bb" suggested. As expected, there is a diagonal of cells that add up to 7. It consists of 6 cells. Also, there is a "cross" of cells that contain AT LEAST one 4. This cross consists of 11 cells. The same is true for all the crosses of cells containing at least one 1, 2, 3, 5, or 6. All of these 6 crosses of 11 cells have 2 junctions with the diagonal of 6 cells with sum 7. So it is clearly visible that 2 out of 11 cells for each cross are winners, if you bet on outcome 7.

Therefore, if we restrict the information given by the observer to one particular cross, the odds are clearly 2/11. If the player asks the observer: "is there at least one 4 (or other number)?" He is in fact asking whether or not the unknown cell is on one particular cross. The answer can be yes or no. The same is true if the observer is in another way restricted to give iformation only about one of the six "at least one N" crosses (i.e. to reveal the prescence of one specific number on one or two dice).

However each cell is on *two* different "at least one N"-type crosses. For example, the cell (2,4) is on the "at least one 4" cross, but also on the "at least one 2" cross. This has implications, depending on the strategy of the observer. Let me explain:

If the observer is free to choose which die he reveals, the information is worthless, as I will explain below. And this is how I interpreted the original question. I assumed that he always told the player: "there is at least one "N", after randomly picking one of the dice. In other words, the observer would randomly pick one of the two "at least one N" crosses that the cell is on. That means that each cell on the "at least one 4" cross has a 50% chance to be revealed as part of another cross. Each cell? Wait a minute! What about the (4,4) cell? That one is on *just one* cross. It can only invoke the hint: "there is at least one 4", whereas the other cells have only 50% chance to invoke the same information. The same is true for the (2,2) cell, in case the hint can only be: "there is at least one 2" etc. These cells make up the other diagonal of the matrix. Each cross has exactly 1 junction with this diagonal.

Given the assumption that the random strategy is followed, the original poster was right to count the double numbers twice, or at least attributing to the hint: "there is at least one N" a twice higher probability to come from a double number, than from another pair. In hat case, 1/6 of all the hints "there is at least one 4" will come from the pair of two 4's.

Randomly picking a die to reveal spoils the information.

In my opinion, such a strategy is not excluded by the words: "given that there is at least one 4", but it IS excluded if the player (by asking), or some rule, can restrict the information given by the observer, to one specific number. But I'm open to corrections about that view. And to the rest of my typings of course.

Tougher than the Monty Hall problem, this one. More seductive too, maybe because 2/11 and 1/6 are so close. But fun anyway.

Posted by: Justin Case | July 14, 2008 2:46 AM

You are right - the probability is 1/6. You very correctly pointed out that if the person told you there is at least a 1, a 2, a 3, a 4, a 5, or a 6 that the outcome would be 2/11. So how could it be that it is 2/11 when for any of the outcomes the person can tell you it is at least one of these? That would imply that the real probability is 2/11. I thought hard about this and I know if I run a stochastic model it is 1/6 if I don't look and 2/11 if I do. Strange. But here is where the problem is wrong. The only way it is 2/11 is if the person promises that he is going to shake the dice and only when at least one of them is a 4 will he then ask you the probability that the combination is 7.

Any problem that is based upon an already run outcome means nothing, because you can always tell someone something about the outcome, but you have not promised what you are going to say in advance, and if it does not meet that outcome you are going to find another that does before asking the questions. So Marilyn is wrong in many cases where she happens across an outcome (woman on the phone has two dogs and at least one of them is a female). The odds the other dog is a male are 50/50. :)

Posted by: Chris | November 27, 2008 4:21 AM

I am quite surprised that you got everything right. As you can see from many of the other comments, most people (I'm not sure whether it is "even" those, or "especially" those in the field) get stuck in a rut and can't see the whole picture.

The only real issue is interpretation, and you've defined the interpretation for each answer.

The one who claims you don't understand the word "given", either doesn't understand it himself, or (and this seems the case) he ignored everything you said except your argument about doubling the 4-4 probability if you are being told there is a four. He would have a point if the problem specified "at least one", and you were arguing that being told "at least one is a 4" still gives 1/6 (which it would if 4 had the same significance as any other to the person telling).

To assume that "one" implies "at least one" or "exactly one" is absurd. It can be interpreted in either of those ways, but my guess is that most native speakers would interpret it as you did. Hence your interpretation is the best (from a language standpoint at least).

As for the mutually exclusive and exhaustive sets: if instead of saying "one of the 6 numbers always appears", you said "one of the 6 numbers is always given", you would have it. So you didn't make an implication in your argument explicit, big deal. Anybody with half a mind could follow it.

In conclusion, your original post is essentially complete and definitely accurate (would that Marilyn Vos Savant could be as complete in her answers). The fact that you have so many arguing against you is a testament to the delinquency of our educational system (or is it the inability of most human brains to comprehend logic?).

Posted by: John | February 4, 2010 3:52 AM

Hi there Heath. Thanks for raising an interesting issue.

David Speyer's comment and your response of Feb 18, 2007 pretty much nailed this three years ago, but a recent related incident may be of interest.

I will get to that in a minute, but first let me summarize what I think is the outcome of this discussion:

The wording as a "common language" problem was ambiguous, with at least four reasonable interpretations:

1. Exactly one die shows a 4 -> p(7)=1/5

2. At least one die shows a 4 -> p(7)=2/11

3. A specified one of the two shows a 4 -> p(7)=1/6

4. A randomly chosen one shows a 4 -> p(7)=1/6
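
The four interpretations listed above can be checked by exact enumeration of the 36 outcomes. This is only a sketch: the helper `p_seven` and the weighting scheme are illustrative assumptions, and interpretation 4 is modelled as one of the two dice being picked with equal probability and found to show a 4.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # (first die, second die)

def p_seven(weight):
    """P(sum is 7 | evidence), where weight(cell) is proportional to
    P(cell and evidence)."""
    total = sum(weight(c) for c in outcomes)
    hits = sum(weight(c) for c in outcomes if sum(c) == 7)
    return Fraction(hits) / Fraction(total)

interpretations = {
    "1. exactly one die shows a 4": lambda c: int(c.count(4) == 1),
    "2. at least one die shows a 4": lambda c: int(4 in c),
    "3. a specified die (the first) shows a 4": lambda c: int(c[0] == 4),
    # A double 4 is certain to show a 4 on the chosen die; other cells on
    # the cross only half the time.
    "4. a randomly chosen die shows a 4": lambda c: Fraction(c.count(4), 2),
}

results = {name: p_seven(w) for name, w in interpretations.items()}
for name, p in results.items():
    print(name, "->", p)
```

The only difference between cases 2 and 4 in this sketch is the weight given to each cell, not the set of cells that remain possible.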

It was also intended (but not clearly stated) that the phrase "given that" is to be interpreted in the standard technical sense of conditional probability (which you correctly dealt with in the epilogue to your posting - and which a number of the responders above modelled with Monte Carlo programs).

The lecturer intended case 2 and you interpreted it as case 4. My only criticism of your complaint would be that the ambiguity of colloquial language is the *reason* we have technical definitions (it's not just willful pedantry for its own sake) and so your resistance is a bit like the student who denies that |x| is continuous because he can't draw it in a continuous motion without stopping at the corner. So in one sense I am tempted to say "We have to make a choice of interpretation, and in technical discussions we want to be sure that everyone is making the same choice. You were told the definition to assume so get over it".

But on the other hand I too am often annoyed by pedants who place too much emphasis on conventions as opposed to understanding - especially when they expect technical conventions to be applied in "word problems" that are expressed in colloquial language (where in my view the correct response is to identify the ambiguity rather than to identify one "correct" answer).

Sometimes, such pedants even assume the context of a technical definition when the colloquial language is clear enough to show that it does *not* apply. In your case this didn't happen, but if the problem had said "given that you (accidentally) see one of the dice is a 4" rather than "given that (at least) one is a 4", then your answer would have been unambiguously right and the lecturer would have been just plain wrong. (In which case you should *never* have given in - some matters of principle really are worth fighting to the death over!)

The recent incident that I referred to involves high level math students and some professional mathematicians responding incorrectly to the "Glimpse a Heart" version of your problem. See recent issues of Math Horizons, and the discussion at http://qpr.ca/blog/2009/11/21/have-a-heart/ for more on this (including a poll where you can vote for what you think is the correct answer).

Posted by: Alan Cooper | February 12, 2010 5:08 AM

John:

Thanks for your comments. Perhaps the failure to comprehend is not so much a measure of delinquency, but an underestimation of how unintuitive these simple problems can be. Often, I think, people will arrive at an intuitive conclusion and then defend it to death lest it be shown that their "gut feel" on seemingly simple matters is faulty!

Alan:

Good summary, but even the phrase "given that at least one die" doesn't capture the requirement that scenarios where this did not happen have been thrown out. I could still interpret "given that at least one die" as being an observation after two dice are rolled rather than a precondition on the roll. But I see your point about it suggesting a particular technical interpretation.

The Glimpse a Heart problem is a beauty! But after some careful study I realise it is just two variations from the dice problem, neither of which change the game. The first change is trivial - the six options of a die have been replaced with the 52 options of a deck. That makes it harder to rationalise but makes no material change to the algorithm. The second change is that the cards are "not replaced". That is, once you have one, there are only 51 options for the second card. With two dice, both dice have 6 options. While this changes the calculation from a permutation to a combination, the logic still holds.

The problem prompted me to propose a third problem (as fizzle did in the comments above) where the numbers are more immediately accessible. Suppose I'm watching some two-up (a game played in Australia on Anzac Day where two fair coins are tossed and people bet on the three outcomes, "heads", "tails" and "evens") and decide to employ this "extra information" trick. First let me give a head a score of "1" and a tail a score of "0", so I can use the dice terminology. Suppose the coins are tossed and I "glimpse" one, or am told the value of one, or one falls in my view, so that I now know that one of them is a head. What are my chances that the sum of the coins is 1 (or equivalently, that the coins are different)? The sample space argument, and the corresponding "combinations" argument in Glimpsing A Heart, shows that the chance is 2/3, which is much higher than the 1/2 I started with!

Of course, this is baloney - I don't get a better chance just by seeing one of the coins, because the other coin still has a 50/50 chance of being the one I need.

What would give me a better chance was if I (or my informer) made the choice beforehand - "I'm not going to bet unless at least one of the coins comes up heads". Then, I would watch and *disregard* all those games where my precondition is not satisfied. Once a favourable game came up, then yes, my chances of "evens" are better at 2/3.
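
The contrast between the two coin scenarios can be made concrete with a small exact calculation. This is a sketch under stated assumptions: "policy" means discarding every game without a head, and "glimpse" means seeing one of the two coins chosen with equal probability; the variable names are invented for illustration.

```python
from fractions import Fraction
from itertools import product

coins = list(product("HT", repeat=2))  # HH, HT, TH, TT, equally likely

# Policy: decide beforehand to bet only when at least one head shows,
# and disregard every other game.
kept = [c for c in coins if "H" in c]
p_policy = Fraction(sum(c[0] != c[1] for c in kept), len(kept))

# Glimpse: one coin, chosen at random, is seen and happens to be a head.
# weight[c] = P(the glimpsed coin is a head | outcome c)
weight = {c: Fraction(c.count("H"), 2) for c in coins}
p_glimpse = (sum(w for c, w in weight.items() if c[0] != c[1])
             / sum(weight.values()))

print(p_policy)   # 2/3
print(p_glimpse)  # 1/2
```

The same three outcomes remain possible in both cases; what differs is how likely each one was to produce the observation.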

This is analogous to what happens in the Monty Hall problem, which is also astonishing because you get a jump from an intuitive 1/2 to a rather favourable 2/3. When Monty Hall plays his game, he has *already* decided he's going to reveal a non-winning door. The less favourable outcome has been removed and our chances improve.

DavidSpeyer in the comments above also uses this interpretation to make his point.

By the way, whether I wait for a favourable game of two-up that shows at least one tail rather than at least one head makes no difference. In fact, I can even change my mind between games! The important thing is that I make my decision before the game, and then *don't play* if the game doesn't match my decision. In the long run, the more favourable probability will bear true.

Posted by: Heath Raftery | February 12, 2010 6:45 PM

I hope you don't have a problem with long comments - this definitely isn't a tweet.

First, I'd like to apologize for the last sentence in my previous comment. You are right that these simple problems are not intuitive. The human brain was not primarily designed for logic but rather for socializing and surviving. Any logic outside of common experience seems unnatural, and for those to whom logic seems a little more natural, social skills tend to come less naturally. That said, it still seems wrong that so many who are apparently highly educated in logic and probability still disagree with you on more than just semantics (fortunately there are also those who realize your logic and probability aren't at fault).

You are also right about people adamantly sticking to a conclusion they have come to. They will do so even when they would instantly realize their error if they could be objective about it. I think this is a defense mechanism to protect our egos. We all have them, and I think they all need protecting. Blocking out the knowledge of an error is one protective mechanism. A way to combat this problem is to view the ability to learn from one's mistakes as a very positive thing. So admitting a mistake doesn't necessarily undermine the ego, but instead can give it a boost. This doesn't necessarily work well in all situations. In particular, it tends to fail when one is under attack and is belittled for a mistake. This is one reason for my apology.

Getting back to the problem, reading the 2 comments before mine by Justin Case and Chris, it seems clear they do not understand the phrase "given that". Alan Cooper's comment that follows mine points to David Speyer's from Feb. 2007, and your response to it. Alan said these "pretty much nailed this three years ago". I have to disagree (the first paragraph of his summary is better). David only addresses the meaning of "given that" and ignores the ambiguity of "one is a". And David's last paragraph may have led to (or reinforced) your specific confusion (*see note at end). Reading your original post, it seemed you perfectly understood the phrase "given that" (rereading it - especially the last paragraph - I can see the confusion was probably already present). But in your response to David on 2-18-07 you say:

"Quite simply, if one treats the word "given" in the problem as the equivalent of the pipe symbol in P(A|B), then it's simple - 2/11. If one considers the reality of the situation described, then there is a leap of faith to assume that non-complying scenarios are being disregarded."

This suggests you do not really understand the phrase "given that". In short, "given that (or given only that - as I think it should be stated) A, what is the probability of B" is equivalent to "what is P(B|A)". And there is no leap of faith to assume that non-complying scenarios are being disregarded as there are no non complying scenarios. There is but one event, and we know that for that event A is true. By coincidence, P(B|A) could equal P(B), or it could equal P(B|C), but this is ONLY by coincidence and they are not the same probability. They just have the same value. The probability P(B|A) applies also to all events matching the event in question and for which the information A is known and nothing else. If you are given different information, then this probability does not directly apply even if the math for calculating the new probability is the same and gives the same result.

Here is my opinion on the subject in much greater detail.

The only ambiguity in the phrase "given that" can be removed by saying "given only that" (or, if one is a stickler, "given that and only that"). I see no possible interpretation of the restricted version other than what Alan refers to as the "standard technical sense of conditional probability", and what you call "the equivalent of the pipe symbol in P(A|B)". The unrestricted version is practically meaningless.

With the unrestricted version of "given that A, what is the probability of B", then any time A is true, then the condition is met even if we also know other information (even information such as B is true or B is false). So, knowing that a particular one is a 4 means that at least one is a 4, so the answer to "given that at least one is a 4, what is the probability of a total of 7" could be 1/6. But if we see that there is a 4 and a 2, "at least one is a 4" is still true, so the answer could be 0. If we see that there is a 4 and a 3, "at least one is a 4" is still true, so the answer could be 1. If we see that there is a 4 and we somehow get the information that the other one has a probability p of being a 3, "at least one is a 4" is still true, so the answer could be any possible probability p. So with the unrestricted version of "given that" any answer can be correct. This is true whenever the unrestricted version is used except when A directly implies B in which case the answer can only be 1, or A directly implies NotB in which case the answer can only be 0. Clearly no one is interpreting it this way, and I see only one other possible interpretation and that is the restricted version "given only that".

So, to calculate your probability you have only the description of the event (in this case: a pair of dice - assumed to be fair - is thrown) and the information A from the phrase "given only that A". In this case A is "one is a 4" which is ambiguous, and the root of the problem.

As an aside, I think "given that" should never be used. While it is never interpreted as the wrong one of the only 2 possible interpretations, it still causes confusion. It does so mainly because the literal interpretation is the wrong one. When used it should be explicitly stated that it is being used as a short hand for "given only that". But, creating a special short hand such as an acronym ("G.O.T." or "g.o.t" for example) and explicitly defining that as "given only that" would be even shorter, and I think should be less confusing.

Reading your answer to Alan Cooper, I think I see your specific confusion. If I'm right and you keep an open mind while reading the rest of this, I hope I can remove the confusion for you. You say:

"Good summary, but even the phrase "given that at least one die" doesn't capture the requirement that scenarios where this did not happen have been thrown out. I could still interpret "given that at least one die" as being an observation after two dice are rolled rather than a precondition on the roll. But I see your point about it suggesting a particular technical interpretation."

The phrase "given only that" is not a precondition for an event. It is rather the result of some type of observation. It is however, a precondition to the calculation of the probability. The dice were thrown; an observation was made; you obtain exactly the information "at least one is a 4"; finally, based on everything you know, you calculate the probability of the sum of the dice being 7. It could also be a precondition to the event, in which case you can calculate the conditional probability before throwing the dice, but this would not generally be the case. I will assume from now on that it is not a precondition to the event.

Since it is a precondition to your calculation, during the calculation you can throw out all possible events that don't match the information you have. This is also true if you are planning a simulation, whether it be a computer simulation or a real world simulation where you are throwing physical dice. This is also true whether you are given only the information "a particular one is a 4" or "at least one is a 4" (the 2 of course will give different answers).

So, if the only difference between "a particular one is a 4" and "at least one is a 4" is the result each gives, why do you think that the second can only be a precondition to the event? For you, I believe this is the paradox. Half of the answer, I think, is that you are confusing a simulation with the original event. So why can you see "one particular one is a 4" as an observation? For this particular problem, "one particular one is n" where n is between 1 and 6 (inclusive) meets 2 special criteria: it describes one of 6 exclusive and complete sets; and all the sets it describes give the same probability. This allows you to look at a simulation and say nothing has to be thrown out. Regardless of n the problem remains the same. And, for any simulation, it will always meet one of the conditions and only one. So by running 6 simulations simultaneously, you don't need to throw anything out. And since they are all mathematically equivalent, it's really like running one big simulation without throwing anything out. This is only possible because none of the 6 conditions changes the probability you had before getting any information. Since you throw nothing out, you see no precondition to the simulation. But you are actually performing 6 simulations with 6 different preconditions. They just happen to be alike mathematically and happen to give the same result.

Let's say that instead of a 7 we were looking for a 9. "One particular one is n" still describes one of 6 exclusive and complete sets. But now the answer depends on n. If n is 3 or more, the probability is still 1/6. But if n is strictly less than 3, then the probability becomes 0, whereas the probability before receiving any information is 1/9. Now we are ready for the second half of the reason for your paradox.
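
The dependence on n for a total of 9, and the lack of it for a total of 7, can be tabulated directly. A minimal sketch, assuming "one particular one" is the first die of an ordered pair; `p_total_given_first` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # (first die, second die)

def p_total_given_first(total, n):
    """P(sum == total | the first die shows n)."""
    kept = [c for c in outcomes if c[0] == n]
    return Fraction(sum(sum(c) == total for c in kept), len(kept))

table = {}
for total in (7, 9):
    by_n = [p_total_given_first(total, n) for n in range(1, 7)]
    # Each value of n is equally likely, so averaging over n recovers the
    # probability held before any information was given.
    prior = sum(by_n) / 6
    table[total] = (by_n, prior)
    print(total, [str(p) for p in by_n], "prior:", prior)
```

For a total of 7 every entry equals the prior, so learning n changes nothing; for a total of 9 the entries differ from the prior of 1/9, so learning n is informative.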

A pair of dice is thrown outside your view and someone with no number bias yells out "at least one is a 4". The probability of getting a seven in this case is 1/6. This is true even though the phrase "at least one is a 4" was used instead of "one particular one is a 4". But "at least one is a 4" is supposed to give a probability of 2/11! How is this possible? Simply put, you did not get exactly the information "at least one is a 4". You also know that if another number is present the person would have been just as likely to have called that other number. Which means that in your calculations or simulations you need to throw out all pairs that do not include a 4, AND you also have to throw out half of all pairs that include exactly one 4.
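
The "throw out half of the pairs with exactly one 4" rule can be tried out in a simulation. This is only a sketch under an assumed model: the unbiased caller is taken to name the number on one of the two dice chosen at random, and is compared with an informant who merely confirms that a 4 is present. The function name, trial count and seed are arbitrary choices.

```python
import random

def simulate(trials=200_000, seed=1):
    """Estimate P(sum is 7 | a 4 is reported) under two assumed reporters."""
    rng = random.Random(seed)
    unbiased_calls = unbiased_sevens = 0
    asked_calls = asked_sevens = 0
    for _ in range(trials):
        dice = (rng.randint(1, 6), rng.randint(1, 6))
        is_seven = sum(dice) == 7
        # Caller with no number bias: names one of the two dice at random,
        # so a pair with exactly one 4 yields "4" only half the time.
        if rng.choice(dice) == 4:
            unbiased_calls += 1
            unbiased_sevens += is_seven
        # Informant who only answers the question "is at least one a 4?"
        if 4 in dice:
            asked_calls += 1
            asked_sevens += is_seven
    return unbiased_sevens / unbiased_calls, asked_sevens / asked_calls

unbiased, asked = simulate()
print(f"unbiased caller:        {unbiased:.3f} (exact 1/6  = {1/6:.3f})")
print(f"asked 'at least one 4': {asked:.3f} (exact 2/11 = {2/11:.3f})")
```

The simulation does not settle which reading of the original question was intended; it only shows what each assumed reporter implies.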

So, if someone yelling out "at least one is a 4" isn't how you get the information "at least one is a 4", how do you get the information? Problems using "given" (such as the one your lecturer gave you) do not say how you get the information, and as far as the problems are concerned, it is totally and completely irrelevant! That said, searching for the answer can be enlightening, and is well worth the effort (and never doing so will likely lead to errors when real world problems are encountered). I think you have the answer, but you seem to have worded it in terms of a simulation where it is a precondition. The simplest way would be to ask an informant (someone who can see the dice and who won't lie to you) "is at least one 4 showing?"

To make things clear, let's look at a particular event. We have a pair of dice of which only one is green. The dice are thrown out of your sight and the green one shows a 4. And you have an informant who can see the dice.

- You only ask the informant "what number is the green die showing?" He will answer "4".

- You only ask the informant "is the green die showing a 4?" He will answer "yes".

Both of these situations can be written as "given only that the green one is 4", and your probability for getting 7 (or 9) is 1/6.

- You only ask the informant "is at least one a 4?" He will say "yes".

This situation can be written as "given only that at least one is a 4", and your probability for getting 7 (or 9) is 2/11.

(If you are wondering how the same event can have different probabilities for the same outcome, I believe this is called the observer's paradox or something similar. In a nutshell, once the dice are thrown, either a total of seven is showing, or it is not. So the probability is either 1 or it's 0, and the informant knows which it is. But your information is restricted. Before getting any information from the informant, the probability based on the information you have is 1/6 to get 7 and 1/9 to get 9. After finding out from the informant that the green die shows a 4, the probability is now 1/6 to get a 7 and also 1/6 to get a 9. After observing both dice for yourself, the probability becomes 1 or 0 (whichever the case may be) for you as well. And at no time did the numbers on the dice ever change. So the probability depends on the information available about the event. This means that different 'Observers' of the same event can have different probabilities for a given outcome if they have different information about the event - each either 'observed' only a part of the event (as poker players do), or got second-hand information and didn't directly 'observe' it. Here, if you only ask the informant "is at least one a 4?", then you do not know that the green one is a 4 and the probability based on the information you have is 2/11.)

In that last situation above, the information "at least one is a 4" was obtained as a result of observation, and did not rely on any precondition. But I suspect you may have one final question: what if there hadn't been a 4? The idea behind this question would be that since you couldn't get the information "at least one is a 4", the information had to be a precondition. But the event already took place. It is a single event and it is over. The outcome has already been determined, and we found out that there was in fact at least one 4. "What if" is irrelevant once the event is in the past and the outcome has already been determined. "What if" only matters in simulation. In a simulation you cannot guarantee the outcome, and so you need a precondition, and all the events whose outcome does not meet the specified criteria are thrown out. And all of this applies equally to "a particular one is a 4".

But, for the sake of argument, and because it might help remove any vestige of confusion, here is the answer. This time let's look at an event identical to the last except that the green die is a 2 and the other is not a 4.

- You only ask the informant "what number is the green die showing?" He will answer "2". Your information is "the green die is a 2" and your probability to get 7 is 1/6, and to get 9 is 0.

- You only ask the informant "is the green die showing a 4?" He will answer "no". Your information is "the green die is not 4" and your probability to get 7 is 1/6 (5 options that give 7 out of 30), and to get 9 is 1/10 (3 options that give 9 out of 30).

- You only ask the informant "is at least one a 4?" He will say "no". Your information is "neither die is a 4" and your probability is 4/25 for a 7 and 2/25 for a 9.
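
The probabilities quoted for each possible answer from the informant can be reproduced by counting. A minimal sketch, assuming ordered pairs with the green die first; the `prob` helper and the labels are illustrative only.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # (green die, other die)

def prob(total, condition):
    """P(sum == total | condition), by counting equally likely outcomes."""
    kept = [c for c in outcomes if condition(c)]
    return Fraction(sum(sum(c) == total for c in kept), len(kept))

answers = {
    "green die is 4":   lambda c: c[0] == 4,
    "at least one 4":   lambda c: 4 in c,
    "green die is 2":   lambda c: c[0] == 2,
    "green die not 4":  lambda c: c[0] != 4,
    "neither die is 4": lambda c: 4 not in c,
}

table = {name: (prob(7, cond), prob(9, cond)) for name, cond in answers.items()}
for name, (p7, p9) in table.items():
    print(f"{name:18} P(7) = {p7}   P(9) = {p9}")
```

Each row conditions on exactly the answer received and nothing else, which is the "given only that" reading used in this comment.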

In short, the phrase "given only that A" means that A is an observed fact about the event, but it is a precondition to any simulation to test the probability of an outcome for the event.

I hope this did the trick. If not.... Well, that would explain why I don't teach.

Now, let's get back to the meaning of "one is a 4". I believe I was wrong last time. Thinking it over, I don't see "at least one is a 4" as a possible meaning. It has only 2 possible meanings: "exactly one is a 4"; and more commonly "one particular one is a 4". So that means your lecturer was just plain wrong... Or does it?

In this context, "one" is a pronoun. "One" refers to a single member of a previously mentioned group. Here the group is the pair of dice, and "one" would refer to one of the dice. "one is a 4" differentiates the dice such that we can talk about "the other". "At least one is a 4" does not and "the other" has no meaning. So the 2 phrases are not equivalent. "one is a 4" or "a particular one is a 4" implies that "at least one is a 4" but "at least one is a 4" does not imply that any particular one is a 4. No matter which particular one a person might choose, "at least one is a 4" allows that one to be something other than a 4. So "at least one is a 4" does not imply "one particular one is a 4". Looking at it in terms of sample space, "at least one is a 4" restricts it to 11 possibilities while "a particular one is a 4" restricts it to 6. And, those 6 are included in the 11. So "a particular one is a 4" is more restrictive (more information) and implies "at least one is a 4".

Now, for the paradox which I believe is the root of almost everyone's trouble with this problem, the "glimpse a Heart" problem that Alan Cooper mentioned, and Marilyn Vos Savant's problem with 2 children "at least one of whom is a boy" (but not the Monty Hall problem). Whenever at least one is a 4, the statement "one is a 4" will always be true. So, "at least one is a 4" does imply "one is a 4". Since we already know that the reverse implication is valid, they must be equivalent. Since "one is a 4" is equivalent to "a particular one is a 4", then all 3 statements must be equivalent. Apparently an awful lot of people believe this to one degree or another.

The problem is that "one is a 4" can have 2 different meanings. If there is a red die and a green die, "one" could refer to the red one or to the green one. "At least one is a 4" does not imply either meaning, it only implies that at least one of the possible meanings is true. Since we don't know a priori which meaning is intended, after the fact we can always pick one that is true. For 2 statements that mutually imply each other to be equivalent, they both have to be unambiguous.

The lecturer could argue that "given only that one is a 4" means that any time it would be truthful to say "one is a 4" the condition is met, which would mean that there is no more information than "at least one is a 4". I would interpret it the same as you did: the condition is true only if one of the 2 possible meanings is met. This means it is ambiguous, but calculating the probability with either meaning is identical and gives the same result, so we don't need to know which meaning is intended and the information is valid. If you had a normal die with 6 square sides along with either a die with 4 triangular sides (for this type of die, no face is pointing straight up, and the one facing down would have to be the one to count) or a die with 12 pentagonal sides, then "one is a 4" or "a particular one is a 4" would not be information that could be used without knowing which was meant or having a probability for each meaning (with 2 dice the default assumption would be a 50% chance for each meaning - giving probabilities of 5/24 for a 7 and 1/12 for a 9). "At least one is a 4" still has unambiguous meaning. For a 6 sided die with a 4 sided die, the probability for a 7 would be 2/9 and for a 9 would be 1/9.
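
The figures for a six-sided die paired with a four-sided die can be verified the same way. A sketch only, assuming both dice are fair and that each meaning of "one is a 4" is given the 50% default weight mentioned above; the names are invented for illustration.

```python
from fractions import Fraction
from itertools import product

# (six-sided die, four-sided die): all 24 pairs are equally likely
outcomes = list(product(range(1, 7), range(1, 5)))

def prob(total, condition):
    """P(sum == total | condition), by counting equally likely outcomes."""
    kept = [c for c in outcomes if condition(c)]
    return Fraction(sum(sum(c) == total for c in kept), len(kept))

results = {}
for total in (7, 9):
    d6_is_4 = prob(total, lambda c: c[0] == 4)
    d4_is_4 = prob(total, lambda c: c[1] == 4)
    # Assumed default: either meaning of "one is a 4" is equally likely.
    mixed = (d6_is_4 + d4_is_4) / 2
    at_least_one = prob(total, lambda c: 4 in c)
    results[total] = (d6_is_4, d4_is_4, mixed, at_least_one)
    print(total, d6_is_4, d4_is_4, mixed, at_least_one)
```

With unlike dice the two meanings of "a particular one is a 4" no longer give the same value, so the ambiguity that was harmless with two ordinary dice now matters.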

While I disagree with the lecturer's interpretation, I cannot emphatically state that it is wrong. However, it perpetuates the notion that "at least one" is the same as "a particular one", and for that reason he should definitely not use it. His question should have been "given only that at least one is a 4, what is the probability for the sum to be 7?" Stated this way there is no ambiguity, and the answer is 2/11. Actually, stating it as he did could be useful, but only if it was to point out the paradox AND explain it. Without explanation it only creates confusion. An instructor should be trying to enlighten rather than confuse. Since he refused even to discuss the issue, I suspect he himself was confused. He himself was probably taught by someone like himself.

So when Alan Cooper is tempted to say "We have to make a choice of interpretation, and in technical discussions we want to be sure that everyone is making the same choice. You were told the definition to assume so get over it", I am tempted to say "I completely agree with the first sentence, which is precisely why unnecessarily ambiguous or confusing phrases should never be used. So, get over it? I don't think so! And if I am told that the definition of 'red' is black, should I just accept that and 'get over it'?" But Alan's next 2 paragraphs are to the point.

Many many years ago, ASU's physics department held weekly colloquia (they probably still do). One week the speaker was a researcher whose research was not physics per se, but rather the teaching of physics. He had made a very disturbing discovery. He had devised a test in very basic physics. A test that required no math. To his great surprise, top high school physics students from around the country were failing on his test! His conclusion was that these students had been taught how to use equations and math. "Word" questions were somewhat standardized such that simple rules could be applied to convert them to appropriate equations. But their understanding of the underlying physics was abysmal.

I get the impression that instructors such as the lecturer you have issues with are creating the same type of problem in logic and probability. The math part is taught, and the rules for converting standardized questions into mathematical terms are also taught, but the underlying logic and the probability of real situations are ignored as much as they can be.

This would make a good ending; unfortunately, there are a few more 'short' comments I'd like to make.

First, the programmers who think that their simulations can unequivocally resolve any dispute are ones whose simulations I will always distrust. A simulation cannot resolve a dispute about the meaning of a question. When they write their simulation program they are assuming one of the meanings at the outset. They obviously don't realize this or they wouldn't think they are resolving the dispute. So any simulation they write might not be for the intended situation, and none of their simulations can be trusted. This would apply to Steph who commented on 2-19-07.
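To make the point concrete, here is a rough sketch of two simulations that differ only in what the informant is assumed to do. The function names, the trial count and the seed are all my own choices, not anything from the programs posted earlier in this thread.

```python
import random

def simulate(protocol, target=7, trials=200_000, seed=0):
    """Estimate P(total == target | informant says "one is a 4")."""
    rng = random.Random(seed)
    said_four = hits = 0
    for _ in range(trials):
        roll = (rng.randint(1, 6), rng.randint(1, 6))
        if protocol(roll, rng):
            said_four += 1
            hits += (sum(roll) == target)
    return hits / said_four

def reveals_random_die(roll, rng):
    # Informant picks one die at random and reports it; we keep the
    # trial only when the reported value happens to be a 4.
    return rng.choice(roll) == 4

def checks_for_any_four(roll, rng):
    # Informant inspects both dice and speaks whenever a 4 is present.
    return 4 in roll

print(simulate(reveals_random_die))   # close to 1/6
print(simulate(checks_for_any_four))  # close to 2/11
```

Both runs are "correct"; each simply answers the question its protocol encodes, so neither can say which question was being asked.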

Finally, a couple comments about your response to Alan Cooper.

Nothing you say is specifically wrong, except that the critical difference is to choose in advance whether to bet or not. Here you are talking about a winning strategy instead of the probability of a particular event. Decisions you make in advance have no effect on the outcome of the event and have no effect on any probabilities.

On 4-12-07 you say:

"Don't let the fact that they are the same number lead you to believe that I'm disregarding the new information. As Tony cleverly pointed out - if I was rolling for a total of 3, and I roll a 4 first, my chances of getting a 3 total have plummeted! But, if I'm rolling for an 8 total, my initial likelihood is 5/36 - if a 1 comes up I'm buggered, but if a 2, 3, 4, 5 or 6 comes up, I would then have a 1/6 chance of getting my total."

This shows that you clearly understand the concept. The reason you can't use the same logic with "at least one is a 4" is, I think, due to how your mind tried to resolve the confusion caused by the ambiguity of the meaning of "one is a 4". Let us say you are going for a total of 8. If you learn that a particular die shows a 4, you throw out all possibilities where that die does not show a 4 and you get 1/6 (up from the 5/36 you started with). In exactly the same way, if you learn that at least one is a 4, you throw out all possibilities that do not have at least one 4 (i.e. you throw out all that have no 4) and your probability is 1/11 (notice that the probability has dropped instead of increased: with at least one 4, an 8 requires a double, which is why doubles are prized). This is because the information is a precondition to the calculation of the probability.

Now, if you see "at least one is a 4" as necessarily being a precondition to the event because the probability is different when "at least one is a 4" isn't true, and so those cases have to be thrown out; then you also have to see "a particular one is a 4" as a precondition because "you have to decide in advance" to throw out all cases where the particular one was a 1 to get the same probability. But it's not really a matter of precondition to the event. When you find out (by whatever means) that at least one is a 4 you have a certain probability. When you find out that you have no 4 you have a different probability. Because of the difference, if you decide to bet in one case and not in the other, you can create a winning strategy. The same is true for "a particular one is a 4" or "a particular one is a 1".

I must concede though that there is one small difference between the 2 types of information. "A particular one is a 4 (or a Heart, or whatever)" is very easy to implement, and you have described a number of ways to do so. The simplest is by direct observation: the particular die simply falls within your line of sight. "At least one is a 4 (or a Heart or whatever)" requires knowledge about both dice (or both cards, or all elements) without specific knowledge about either die (either card, or any element). I do not believe it is possible to get this information directly. The only way I see to get the information accidentally is to overhear (This would include 'reading', if the people 'overheard' are using signs or are communicating in written form) someone else who asked for the information.

Back to your response to Alan, you do not deal much with the "glimpse a Heart" problem in terms of actually solving it. It does seem clear that you get the correct answer - glimpsing the Heart does not improve the odds. However your discussion of the two-up game suggests you may have the wrong reason. Glimpsing a Heart or a Head gives the information "a particular one is a Heart" or "a particular one is a Head" (I believe that anyone with a normal understanding of the English language would say that you know that "one is a Heart" or "one is a Head" - or glimpsing a 4 would give you "one is a 4" with the dice). It does NOT give exactly the information "at least one is a ..." You never point this out, and instead suggest that you need to decide in advance "I'm not going to bet unless at least one of the coins comes up heads". This leaves one thinking that glimpsing the coin does give the desired information, you just need to decide in advance to bet if a Head shows and not when a Tail shows or vice versa. This is not a winning strategy (unless betting all the time is a winning strategy because maybe you are the house playing by house rules?). I hope this is simply an oversight in your explanation and not an actual confusion on your part.

I hope that covers it, and I hope I haven't been too ponderous repeating myself. And I hope I haven't made this too loooooonng!

Happy reading! (I hope - apparently I do a lot of that)

*added note.

To address David Speyer's last paragraph directly, that a head appears is not the precondition that David is pointing out. The precondition is rather one about the informant. Without knowing how the informant is making decisions, you can't know what information you are getting. If you are asking questions, you need to know that the informant will always give the simplest and most direct answer (by choosing when to give extra information, he or she could change the probability even for an event where a simple and direct answer is given). If you are not asking a question, and the informant is choosing what information to give, then how the decision is made could be more important than what is said. David pointed out a precondition about the informant (the "maniac") that will give "at least one is a 4". To get the probability of 1/6 when an informant is always saying "one is a ..." or "at least one is a ...", you need to know that the informant has no number bias. If the largest number is always chosen, then when 4 is chosen, the probability for a 7 is 2/7 (of the seven rolls whose largest number is 4, only 3-4 and 4-3 total 7). If the smallest number is always chosen, then the probability is 0. So, the precondition allows us to get the desired information and is not even necessarily a precondition to the event. It is specifically a precondition to obtaining the information.
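The effect of the informant's bias can be checked by counting the 36 rolls directly. This is a sketch under my own assumptions: two fair 6-sided dice, an informant who always names exactly one number, and hypothetical helper names. The largest-number informant comes out at 2/7, because seven rolls have 4 as their largest number and 3-4 and 4-3 are two of them.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))

def p_seven_given_says_four(says):
    """says(roll) maps each reportable number to the chance it is reported."""
    said = hit = Fraction(0)
    for roll in rolls:
        weight = says(roll).get(4, Fraction(0))
        said += weight
        if sum(roll) == 7:
            hit += weight
    return hit / said

def unbiased(roll):
    # Reports either die with equal chance (a double is reported for certain).
    report = {}
    for face in roll:
        report[face] = report.get(face, Fraction(0)) + Fraction(1, 2)
    return report

def largest(roll):
    return {max(roll): Fraction(1)}

def smallest(roll):
    return {min(roll): Fraction(1)}

print(p_seven_given_says_four(unbiased))  # 1/6
print(p_seven_given_says_four(largest))   # 2/7
print(p_seven_given_says_four(smallest))  # 0
```

The same words from the informant, "one is a 4", lead to three different probabilities depending only on how the number was chosen.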

Posted by: john | March 17, 2010 3:51 AM

Excellent contribution John! My delay in replying is not because I was put off by the length of your reply, but instead because of a lengthy reaction that has bounced from certainty to disturbing confusion and from general agreement to strong argument and from enlightenment to worrisome disillusionment!

I feel now I have some constructive remarks.

You split the question into two fundamental statements - one based on "given that" and the other on "one shows". This was awfully useful for deconstructing where some of the confusion lies, but on reflection I see that the separation is not useful for comprehension. Ultimately the meaning of the question relies on understanding the situation posed. The situation posed in this case is that we are "given that one die shows". This is not enough information to unambiguously determine what actually happened, so we are left to make some assumptions.

Fundamentally I made the assumption that our informant (the person that "gives" us the extra information) took a look at the rolled dice and told us what was on one of the dice. In that case the statement "one of the dice" implies "at least one of the dice", because telling us what is on one of the dice does not place a restriction on the other. As you say, "one" is a pronoun in this case, not a cardinal. The statements "exactly one" or "only one" would be describing a different situation that cannot be naturally inferred from the original statement.

With this assumption made, logical inference would lead us to imagine that the informant need only have looked at one of the dice to provide us with the information. Therefore, I imagine a situation where an informant, who has complete information about the dice roll, chooses one of the dice and informs us of what it is showing. It is not difficult to show that in this situation, having made a decision about which die to reveal, the remaining die still has a 1 in 6 chance of having any of the 6 possible values. The probability of a total of 7 is therefore 1 in 6.

The alternative to my assumption is to imagine that the informant is making a statement about the overall result of the roll. That is, they are not choosing to reveal the value of one of the dice, but instead are checking both dice to see if "one is a 4" can truthfully be stated. But you see, this is what I meant by a precondition - the decision to look for a '4' had to be made before checking the dice. If it had not been made, then why would you look beyond the first die? Either the informant wants to reveal the value of one of the dice (my assumption) and it doesn't matter which one they pick - just that they pick one, or they have a precondition and are looking to both dice to validate it. With this assumption made we can actually imagine two outcomes - either there is a 4 showing, or there is not. If there is, then yes, the odds are better at 2 in 11 for a total of 7, but if there's not then the odds are only 4 in 25.
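As a quick check of those two outcomes, here is a small counting sketch (the names are mine, and two fair 6-sided dice are assumed). It also shows that the two cases, weighted by how often each occurs, average back to the plain 1 in 6.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))
with_four = [r for r in rolls if 4 in r]         # 11 rolls
without_four = [r for r in rolls if 4 not in r]  # 25 rolls

def p_seven(subset):
    return Fraction(sum(1 for r in subset if sum(r) == 7), len(subset))

p_with = p_seven(with_four)
p_without = p_seven(without_four)

# Weighting each case by how often it occurs recovers the unconditional value.
overall = (Fraction(len(with_four), 36) * p_with
           + Fraction(len(without_four), 36) * p_without)
print(p_with, p_without, overall)  # 2/11 4/25 1/6
```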

You offer some evidence that my insistence on a "precondition" is flawed. I see where you're coming from - there's no actual reason why my interpretation of events implies no precondition and vice-versa. It just occurred to me as likely in reality. After much furrowing of the brow, I see that my interpretation is not limited to situations where no precondition is made. The precondition argument illustrates the problem, but is not a necessary or sufficient condition! Let's use an example that doesn't have the added confusion of probabilities that happen to equate numerically.

Consider the "probability of a total of 10 on two dice given that one is a 4". There are four interpretations:

"Ignorant" P(total is 10) = 1/12

"Particular" P(total is 10 | a particular one is a 4) = 1/6

"At least" P(total is 10 | at least one but not one in particular is a 4) = 2/11

"Exactly" P(total is 10 | exactly one of the two is a 4) = 1/5

The preceding paragraphs give a practical methodology for why I went with the "Particular" interpretation. The 21000 words that compose this discussion so far give some evidence that no interpretation is universally clear. Your scenarios involving questioning an informant give plausible scenarios that imply each interpretation. I like the intention behind your "G.O.T." suggestion, although I think the wording could still use work!
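For anyone who wants to check the four figures for a total of 10, here is a minimal counting sketch. The function name `p_total` is mine, and for "Particular" I have arbitrarily taken the first die as the particular one.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))

def p_total(total, condition):
    """P(sum == total | condition), counting equally likely rolls."""
    kept = [r for r in rolls if condition(r)]
    return Fraction(sum(1 for r in kept if sum(r) == total), len(kept))

interpretations = {
    "Ignorant": lambda r: True,
    "Particular": lambda r: r[0] == 4,  # the first die, say
    "At least": lambda r: 4 in r,
    "Exactly": lambda r: r.count(4) == 1,
}

for name, condition in interpretations.items():
    print(name, p_total(10, condition))
# Ignorant 1/12
# Particular 1/6
# At least 2/11
# Exactly 1/5
```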

What is clear is that our assumptions are blindingly subconscious and stunningly significant.

To wrap up this post, I have three postscripts:

1. Seeking some resemblance to a "pretty" mathematical proof, I was vicious in chopping all that I wrote into what remains - I have forgone a verbose narrative for what I hope is the essence!

2. The "programmers who think that their simulations can unequivocally resolve any dispute" that you refer to are indeed noise. I'm a big fan of the Monte Carlo method, but like most of statistics it proves nothing more than what the initial conditions encode. Reality is usually more complex.

3. I was quite wrong about two-up! A classic case of ignorantly drawing a long bow at the end of a post. Thank you for pointing that out - the shock to my assumptions was just what I needed to challenge them.

Posted by: Heath Raftery | April 5, 2010 12:45 AM

Alternatively, the lecturer's answer is correct if somebody rolled the dice and was then asked whether the roll had at least one 4. If the dice do in fact show a four, then we're properly in the 2/11 range of probability because, like rerolling until you have a four, such a situation properly excludes any roll lacking a four.

So if you roll and say "I rolled at least one four", the answer is 1/6. But if you roll and somebody asks whether you rolled at least one four, the answer is 2/11.

Posted by: Tatarize | August 7, 2010 3:38 AM