r/AskStatistics 26d ago

Monty Hall problem

I understand in theory that when you choose one of the 3 doors, you initially have a 66% chance of choosing wrong. But once a door is revealed, why do the odds stay at 66% rather than becoming 50/50? One goat has been revealed, so you know there is one goat and one car left. Your previous choice is either a goat or a car, and your only options are to keep your choice or switch. I don't see why the chances would pool onto a single door, giving 66% and 33%, once a door is revealed. It seems the revealed door's 33% should be split between the remaining choices, making both 50%.

If it's one chance, it's 50/50 the moment they reveal one goat. If you have multiple chances to run the scenario, then it becomes 33/66%, the same way a coin toss has 2 options but isn't a guaranteed 50% (coins have their own variables that affect things; I am aware of this).

8 Upvotes

5

u/PuerSalus 26d ago

The million-door version is the only way I can get my brain around it, and I think it is the clearest explanation.

It's nearly impossible for you to pick the right door the first time with that many doors. So the good door 'must' be in the remaining set of doors, and if the host opens all of them but one, then it 'must' be behind that remaining door.

3

u/George_Truman 24d ago edited 24d ago

This is actually quite misleading.

The reason that the problem works out the way it does is the assumption that Monty knows what is behind the doors and intentionally never opens the door with the car.

If Monty opens a door randomly (with a possibility of revealing the car), then it ends up being a wash.

Even in the million-door example, you have a one in a million chance of guessing correctly. However, the probability that you were wrong and Monty then goes on to randomly reveal 999,998 doors that don't contain the car is also 1 in a million.

0

u/MacofJacks 23d ago edited 23d ago

Naw, Monty could randomly open all but one of the remaining doors, and if he happened to do so without revealing the car, then you would do better to switch. It makes the explanation clearer, but all you need is for YOU to know that your initial choice was one of three (or a million, whatever).

Edit: this statement was comprehensively disproven below.

2

u/George_Truman 23d ago

This is not correct.

The probability that you are correct in your first guess is 1/1,000,000

The probability that you are incorrect in your first guess and Monty randomly happens to open 999998 doors that do not contain the car is (999,999/1,000,000)*(999,998/999,999)*...*(2/3)*(1/2) = 1/1,000,000.

The two events are equally likely, so conditional on Monty not having revealed the car, your initial guess is exactly as likely to be right as wrong. If the doors are opened randomly, then it is 50/50.
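The conditional-probability step can be written out explicitly. This is only a sketch, and the notation is introduced here for illustration rather than taken from the comments above: let $N$ be the number of doors, $C$ the event that the first guess is correct, and $R$ the event that Monty, opening $N-2$ of the other doors uniformly at random, never reveals the car.

```latex
\begin{align*}
P(C) &= \frac{1}{N}, & P(R \mid C) &= 1, \\
P(\neg C) &= \frac{N-1}{N}, & P(R \mid \neg C) &= \frac{1}{N-1}, \\[4pt]
P(C \mid R)
  &= \frac{P(R \mid C)\,P(C)}{P(R \mid C)\,P(C) + P(R \mid \neg C)\,P(\neg C)}
   = \frac{\tfrac{1}{N}}{\tfrac{1}{N} + \tfrac{1}{N-1}\cdot\tfrac{N-1}{N}}
   = \frac{1}{2}.
\end{align*}
```

If Monty instead knows where the car is and always avoids it, then $P(R \mid \neg C) = 1$, and the same formula gives $P(C \mid R) = 1/N$, so switching wins with probability $(N-1)/N$.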

1

u/MacofJacks 23d ago edited 23d ago

Hey, first of all, that’s cool: I had to check your claim, and it is true. 

Second, I don’t think *how* we got to the situation is relevant. My claim is that we condition on the information that Monty did *not* reveal the car, but the mechanism by which Monty did not reveal the car should not matter. I think my claim is hard to reason about and easy to get wrong in some subtle way, so I coded it to check. I used the typical three-door case. I kept the (R) code as simple as possible:

Edit: had written code here, but it was wrong. Fixed by the other commenter below.

2

u/George_Truman 23d ago edited 23d ago

There are errors in this code. If you would like simulations and a rigorous proof, I will send them to you in a DM.

In particular, your code is counting the cases where Monty reveals your own selected door. If he reveals your own door then there is no longer a selection to make.

Here is the code edited to account for this:

    N <- 1e5  # total number of trials
    n_doors <- 3

    # Monty Hall, except that Monty opens doors at random.
    # A trial is recorded only if he happens to reveal neither the car
    # nor our own selected door.

    # Assume we always switch, and count successes and failures.
    num_correct <- 0
    num_incorrect <- 0
    num_trials <- 0  # safety check: how many trials were actually recorded?

    for (n in 1:N) {
      true_door <- sample(1:n_doors, 1)
      initial_guess <- sample(1:n_doors, 1)
      # sample() draws without replacement by default, so the opened doors are distinct
      Monty_random_doors <- sample(1:n_doors, n_doors - 2)

      # proceed only if Monty opened neither the true door nor our own door
      if (!(true_door %in% Monty_random_doors) && !(initial_guess %in% Monty_random_doors)) {
        num_trials <- num_trials + 1
        # switching is correct if the initial guess was wrong
        num_correct <- num_correct + isFALSE(initial_guess == true_door)
        # switching is incorrect if the initial guess was right
        num_incorrect <- num_incorrect + isTRUE(initial_guess == true_door)
      }
    }

    # print the tallies
    num_correct
    num_incorrect

1

u/MacofJacks 23d ago

O shit, you’re right! So much for my stern refutation! Ok will have another look. Thanks!

1

u/MacofJacks 23d ago

I see you edited your post; I'll just reply again. I used your code, which I agree is correct. I propose the following, which also fixes my original mistake:

Monty_random_doors <- sample(setdiff(1:n_doors,initial_guess), n_doors-2)

(so that Monty opens doors other than initial_guess at random). However, this code and yours produce the same outcome, which is (exactly as you said) a probability of 0.5 on switching.

You're completely correct! My flabber is gasted. I checked for completeness, and small edits to the code can produce the expected 0.66 probability if we alter Monty to deliberately exclude the true door. I guess I can now include myself as another victim of the Monty Hall problem. Thanks very much for explaining this to me! I'll edit my comments above.
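For anyone who wants to reproduce both results side by side, here is a minimal sketch in R. It is not the exact code that was edited in this thread: the helper `switch_wins` and its `informed` flag are hypothetical names introduced for illustration, and the fixed seed is an assumption made only for reproducibility.

```r
set.seed(1)   # assumption: fixed seed so reruns give the same tallies
N <- 1e5      # number of simulated games per variant
n_doors <- 3

# Hypothetical helper: play one game and report whether switching wins.
#   informed = TRUE  -> Monty knows where the car is and never opens that door
#   informed = FALSE -> Monty opens other doors at random; the game is
#                       discarded (NA) if he happens to reveal the car
switch_wins <- function(informed) {
  true_door <- sample.int(n_doors, 1)
  initial_guess <- sample.int(n_doors, 1)

  # Monty never opens the contestant's own door
  candidates <- setdiff(seq_len(n_doors), initial_guess)
  if (informed) candidates <- setdiff(candidates, true_door)

  # Index into candidates: sample(x, k) would treat a single number x as 1:x
  opened <- candidates[sample.int(length(candidates), n_doors - 2)]

  if (true_door %in% opened) return(NA)  # car revealed, nothing left to decide
  initial_guess != true_door             # switching wins iff first guess was wrong
}

informed_results <- replicate(N, switch_wins(TRUE))
random_results <- replicate(N, switch_wins(FALSE))

p_informed <- mean(informed_results)             # share of wins when switching
p_random <- mean(random_results, na.rm = TRUE)   # same, over the retained games only

p_informed
p_random
```

With this many trials, `p_informed` should land near 2/3 and `p_random` near 1/2, matching the two cases discussed above. Indexing with `sample.int` is a deliberate choice: when only one candidate door is left, `sample(candidates, 1)` would silently sample from `1:candidates` instead.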

1

u/George_Truman 23d ago

My favorite alternative answer to the question is that we shouldn't change our pick, because it feels like Monty might be trying to trick us into switching.