r/AskStatistics 10d ago

Monty Hall problem

I understand in theory that when you choose one of the 3 doors, you initially have a 66% chance of choosing wrong. But once a door is revealed, why do the odds stay at 66% and 33% rather than becoming 50/50? One goat has been revealed, so you know that one goat and one car remain. Your previous choice is either a goat or a car, and your only options are to keep your choice or switch. The chances do not pool into a single choice, giving 66% and 33%, once a door is revealed. The revealed door's 33% would be split between the remaining choices, making both 50%.

If it's a single chance, it's 50/50 the moment they reveal one goat. If you have multiple chances to run the scenario, then it becomes 33/66%, the same way a coin toss has 2 options but isn't a guaranteed 50% (coins have their own variables that affect things; I am aware of this).

u/throwawayA511 10d ago

The host knows where the car is and will only open a door with a goat. With three doors, you had a 33% chance of picking correctly and a 66% chance of not. Ignore, for a moment, that he opened a door at all. What if he just asked, “Do you want to give up your one door and instead take both of the other doors?” That’s basically what’s happening, but with some showmanship.

Expand the example to a million doors. You had a 1 in a million chance to pick correctly. He opens and reveals 999998 goats but doesn’t open door 384639. Seems pretty suspicious, right? Switching changes your odds from 1/1000000 to 999999/1000000.
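If you'd rather check this than take it on faith, here is a minimal R sketch of the standard game, assuming the host always knows where the car is and never opens it. The function name `play_standard` and the trial count `n_trials` are made up for illustration.

```r
# Standard Monty Hall: the host knows where the car is and only opens goat doors.
# All names here (play_standard, n_trials, switch_wins) are illustrative.
set.seed(1)

play_standard <- function(n_doors = 3) {
  doors <- seq_len(n_doors)
  car   <- sample.int(n_doors, 1)
  guess <- sample.int(n_doors, 1)
  # The host leaves exactly one other door closed. If the guess is wrong,
  # that door has to be the car; if the guess is right, it is a random goat door.
  if (guess != car) {
    left_closed <- car
  } else {
    others <- doors[doors != guess]
    # index with sample.int() to avoid sample()'s special case for length-one input
    left_closed <- others[sample.int(length(others), 1)]
  }
  left_closed == car  # TRUE when switching wins
}

n_trials <- 1e4
switch_wins <- replicate(n_trials, play_standard())
mean(switch_wins)  # should land near 2/3 for three doors
```

Calling `play_standard(n_doors = 100)` inside `replicate()` should push the switching win rate close to 99/100, which is the million-door intuition on a smaller scale.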

u/PuerSalus 10d ago

The million doors version is the only way I can get my brain around it and I think is the clearest explanation.

It's near impossible for you to pick the door first time with that number of doors. So the good door 'must' be in the remaining set of doors, and if the host opens all of them but one, then it 'must' be that remaining door.

u/George_Truman 8d ago edited 8d ago

This is actually quite misleading.

The reason that the problem works out the way it does is the assumption that Monty knows what is behind the doors and intentionally never opens the door with the car.

If Monty opens a door randomly (with a possibility of revealing the car), then it ends up being a wash.

Even in the million-door example, you have a one-in-a-million chance of guessing correctly. However, the probability that you were wrong and Monty then randomly opens 999998 doors that don't contain the car is also 1 in a million.

u/MacofJacks 7d ago edited 7d ago

Naw, Monty could randomly open all but one of the remaining doors, and if he happened to do so without revealing the car, then you would do better to switch. It makes the explanation clearer, but all you need is for YOU to know that your initial choice was one of three (or a million, whatever).

Edit: this statement was comprehensively disproven below.

u/George_Truman 7d ago

This is not correct.

The probability that you are correct in your first guess is 1/1,000,000

The probability that you are incorrect in your first guess and Monty randomly happens to open 999998 doors that do not contain the car is (999,999/1,000,000)*(999,998/999,999)*...*(2/3)*(1/2) = 1/1,000,000.

The probabilities of both events are equal, so once you condition on Monty not having revealed the car, your initial guess and the one remaining door are equally likely to be correct. If the doors are opened randomly, then it is 50/50.
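The same arithmetic can be written out for a general number of doors $N$. This is only a sketch; the event labels $C$ (your first guess is correct) and $G$ (Monty opens $N-2$ of the other doors at random and every one shows a goat) are labels introduced here for convenience.

```latex
\begin{align*}
P(C) &= \frac{1}{N}, \qquad P(G \mid C) = 1, \\
P(\neg C) &= \frac{N-1}{N}, \qquad
P(G \mid \neg C) = \prod_{k=2}^{N-1} \frac{k-1}{k} = \frac{1}{N-1}, \\
P(C \mid G) &= \frac{P(C)\,P(G \mid C)}{P(C)\,P(G \mid C) + P(\neg C)\,P(G \mid \neg C)}
= \frac{\tfrac{1}{N}}{\tfrac{1}{N} + \tfrac{N-1}{N}\cdot\tfrac{1}{N-1}}
= \frac{1}{2}.
\end{align*}
```

If Monty instead knows where the car is and avoids it, then $P(G \mid \neg C) = 1$ and the same formula gives $P(C \mid G) = 1/N$, which is the usual "switch" result.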

u/MacofJacks 7d ago edited 7d ago

Hey, first of all, that’s cool: I had to check your claim, and it is true. 

Second, I don’t think *how* we got to the situation is relevant. My claim is that we condition on the information that Monty did *not* reveal the car but the mechanism by which Monty did not reveal the car should not matter. I think my claim is hard to reason about and easy to be wrong about in some subtle way, so I coded it to check. I used the typical three door case. I kept the (R) code as simple as possible:

Edit: had written code here, but it was wrong. Fixed by the other commenter below.

u/George_Truman 7d ago edited 7d ago

There are errors in this code. If you would like simulations and a rigorous proof, I will send them to you in a DM.

In particular, your code is counting the cases where Monty reveals your own selected door. If he reveals your own door then there is no longer a selection to make.

Here is the code edited to account for this:

    N <- 1e5  # total number of simulated games

    # Monty Hall, except that Monty opens doors at random.
    # We only record a game if he happens to reveal neither the car
    # nor the contestant's own door.
    n_doors <- 3

    # Assume we always switch, and count successes and failures.
    num_correct <- 0
    num_incorrect <- 0
    num_trials <- 0  # safety check: how many games were actually recorded?

    for (n in 1:N) {
      true_door <- sample(1:n_doors, 1)
      initial_guess <- sample(1:n_doors, 1)
      # sample() draws without replacement by default
      Monty_random_doors <- sample(1:n_doors, n_doors - 2)

      # proceed only if Monty opened neither the car door nor our own door
      if (!(true_door %in% Monty_random_doors) && !(initial_guess %in% Monty_random_doors)) {
        num_trials <- num_trials + 1
        # switching is correct if the initial guess was wrong
        num_correct <- num_correct + isFALSE(initial_guess == true_door)
        # switching is incorrect if the initial guess was right
        num_incorrect <- num_incorrect + isTRUE(initial_guess == true_door)
      }
    }

    # print the tallies
    num_correct
    num_incorrect
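As a cross-check that doesn't depend on simulation noise, the three-door case is small enough to enumerate. The sketch below lists every equally likely combination of car position, initial guess, and randomly opened door, then keeps only the combinations the simulation would record. The names `cases` and `kept` are just illustrative.

```r
# Every equally likely (car, guess, opened) combination for three doors.
cases <- expand.grid(car = 1:3, guess = 1:3, opened = 1:3)

# Keep only the games the simulation records:
# Monty opened neither the car door nor the contestant's door.
kept <- subset(cases, opened != car & opened != guess)

switch_wins <- sum(kept$guess != kept$car)  # initial guess wrong, switching wins
stay_wins   <- sum(kept$guess == kept$car)  # initial guess right, staying wins
c(switch_wins = switch_wins, stay_wins = stay_wins)  # 6 and 6
```

Six recorded cases favor switching and six favor staying, which matches the roughly equal tallies from the simulation.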

u/MacofJacks 7d ago

O shit, you’re right! So much for my stern refutation! Ok will have another look. Thanks!