<span class="c1">#!/usr/bin/env ruby -w</span>

<span class="c1"># Monte Carlo simulation for the Monty Hall Problem:</span>

<span class="c1"># http://en.wikipedia.org/wiki/Monty_Hall_problem</span>

<span class="cm">=begin</span>

<span class="cm">When using a Ruby version older than 1.8.7</span>

<span class="cm">define the following two methods:</span>

<span class="cm"> class Array</span>

<span class="cm"> def shuffle</span>

<span class="cm"> self.sort_by { rand }</span>

<span class="cm"> end</span>

<span class="cm"> </span>

<span class="cm"> def choice</span>

<span class="cm"> self.shuffle.first</span>

<span class="cm"> end</span>

<span class="cm"> end</span>

<span class="cm">=end</span>

<span class="c1"># Utility class for the simulation of a single Monty Hall game.</span>

<span class="k">class</span> <span class="nc">MontyHall</span>

<span class="k">def</span> <span class="nf">initialize</span>

<span class="vi">@doors</span> <span class="o">=</span> <span class="o">[</span><span class="s1">'car'</span><span class="p">,</span> <span class="s1">'goat'</span><span class="p">,</span> <span class="s1">'goat'</span><span class="o">].</span><span class="n">shuffle</span>

<span class="k">end</span>

<span class="c1"># Return a number representing the player's first choice.</span>

<span class="k">def</span> <span class="nf">pick_door</span>

<span class="k">return</span> <span class="nb">rand</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>

<span class="k">end</span>

<span class="c1"># Return the index of the door opened by the host.</span>

<span class="c1"># This cannot represent a door hiding a car or the player's chosen door.</span>

<span class="k">def</span> <span class="nf">reveal_door</span><span class="p">(</span><span class="n">pick</span><span class="p">)</span>

<span class="n">available_doors</span> <span class="o">=</span> <span class="o">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="o">]</span>

<span class="n">available_doors</span><span class="o">.</span><span class="n">delete</span><span class="p">(</span><span class="n">pick</span><span class="p">)</span>

<span class="n">available_doors</span><span class="o">.</span><span class="n">delete</span><span class="p">(</span><span class="vi">@doors</span><span class="o">.</span><span class="n">index</span><span class="p">(</span><span class="s1">'car'</span><span class="p">))</span>

<span class="k">return</span> <span class="n">available_doors</span><span class="o">.</span><span class="n">choice</span>

<span class="k">end</span>

<span class="c1"># Return true if the player won by staying</span>

<span class="c1"># with their first choice, false otherwise.</span>

<span class="k">def</span> <span class="nf">staying_wins?</span><span class="p">(</span><span class="n">pick</span><span class="p">)</span>

<span class="n">won?</span><span class="p">(</span><span class="n">pick</span><span class="p">)</span>

<span class="k">end</span>

<span class="c1"># Return true if the player won by switching, false otherwise.</span>

<span class="k">def</span> <span class="nf">switching_wins?</span><span class="p">(</span><span class="n">pick</span><span class="p">,</span> <span class="n">open_door</span><span class="p">)</span>

<span class="n">switched_pick</span> <span class="o">=</span> <span class="p">(</span><span class="o">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="o">]</span> <span class="o">-</span> <span class="o">[</span><span class="n">open_door</span><span class="p">,</span> <span class="n">pick</span><span class="o">]</span><span class="p">)</span><span class="o">.</span><span class="n">first</span>

<span class="n">won?</span><span class="p">(</span><span class="n">switched_pick</span><span class="p">)</span>

<span class="k">end</span>

<span class="kp">private</span>

<span class="c1"># Return true if the player's final pick hides a car, false otherwise.</span>

<span class="k">def</span> <span class="nf">won?</span><span class="p">(</span><span class="n">pick</span><span class="p">)</span>

<span class="vi">@doors</span><span class="o">[</span><span class="n">pick</span><span class="o">]</span> <span class="o">==</span> <span class="s1">'car'</span>

<span class="k">end</span>

<span class="k">end</span>

<span class="k">if</span> <span class="bp">__FILE__</span> <span class="o">==</span> <span class="vg">$0</span>

<span class="no">ITERATIONS</span> <span class="o">=</span> <span class="p">(</span><span class="no">ARGV</span><span class="o">.</span><span class="n">shift</span> <span class="o">||</span> <span class="mi">1_000_000</span><span class="p">)</span><span class="o">.</span><span class="n">to_i</span>

<span class="n">staying</span> <span class="o">=</span> <span class="mi">0</span>

<span class="n">switching</span> <span class="o">=</span> <span class="mi">0</span>

<span class="no">ITERATIONS</span><span class="o">.</span><span class="n">times</span> <span class="k">do</span>

<span class="n">mh</span> <span class="o">=</span> <span class="no">MontyHall</span><span class="o">.</span><span class="n">new</span>

<span class="n">picked</span> <span class="o">=</span> <span class="n">mh</span><span class="o">.</span><span class="n">pick_door</span>

<span class="n">revealed</span> <span class="o">=</span> <span class="n">mh</span><span class="o">.</span><span class="n">reveal_door</span><span class="p">(</span><span class="n">picked</span><span class="p">)</span>

<span class="n">staying</span> <span class="o">+=</span> <span class="mi">1</span> <span class="k">if</span> <span class="n">mh</span><span class="o">.</span><span class="n">staying_wins?</span><span class="p">(</span><span class="n">picked</span><span class="p">)</span>

<span class="n">switching</span> <span class="o">+=</span> <span class="mi">1</span> <span class="k">if</span> <span class="n">mh</span><span class="o">.</span><span class="n">switching_wins?</span><span class="p">(</span><span class="n">picked</span><span class="p">,</span> <span class="n">revealed</span><span class="p">)</span>

<span class="k">end</span>

<span class="n">staying_rate</span> <span class="o">=</span> <span class="p">(</span><span class="n">staying</span><span class="o">.</span><span class="n">to_f</span> <span class="o">/</span> <span class="no">ITERATIONS</span><span class="p">)</span> <span class="o">*</span> <span class="mi">100</span>

<span class="n">switching_rate</span> <span class="o">=</span> <span class="p">(</span><span class="n">switching</span><span class="o">.</span><span class="n">to_f</span> <span class="o">/</span> <span class="no">ITERATIONS</span><span class="p">)</span> <span class="o">*</span> <span class="mi">100</span>

<span class="nb">puts</span> <span class="s2">"Staying: </span><span class="si">#{</span><span class="n">staying_rate</span><span class="si">}</span><span class="s2">%."</span>

<span class="nb">puts</span> <span class="s2">"Switching: </span><span class="si">#{</span><span class="n">switching_rate</span><span class="si">}</span><span class="s2">%."</span>

<span class="k">end</span>

Just an observation, but wouldn’t it be more correct to shuffle the doors as well? Although i guess it doesnt actually affect the probability in the situation since trials are independent?

Next time someone doubts this, you can tell them that the same principle is what keeps their inbox with a spam/ham rate less than 1.000.000

http://www.paulgraham.com/spam.html

@Niels: It wouldn’t change the outcome, but the doors are shuffled in both scripts.

@Lucas: Indeed. ðŸ™‚

For the curious,

$ time ruby monty.rb

Staying: 33.2946%.

Switching: 66.7054%.

real 0m13.506s

user 0m12.129s

sys 0m1.296s

$ time python monty.py

Staying: 33.225700%

Switching: 66.774300%

real 0m11.449s

user 0m11.305s

sys 0m0.008s

$ time python monty-psyco.py

Staying: 33.389400%

Switching: 66.610600%

real 0m4.798s

user 0m4.776s

sys 0m0.012s

…on my laptop.

Wow, that was surprising!

Here, a Groovy version of the program:

http://snipplr.com/view/10837/monte-carlo-simulation-for-the-monty-hall-problem/

Peter, I know you did it for fun, but these implementations are really two different beasts. The Python one uses functions that allows it to be more pythonic, while the Ruby one uses an utility class. Also, these implementation are meant to be “proof” for the skeptic reader who may not be a programmer, and as such are not written for speed or conciseness. I could probably write a version that would be 1/5 the length and 10 times as fast, but it would use less intuitive abstractions to model the problem at hand, and it would be less of a step-by-step simulation of the game. I feel that for these two scripts, comparative benchmarks should be left out of the picture. ðŸ™‚

Reading this problem recently, I found it extremely counterintuitive too and was planning to write something like this.

I thought the python code could be improved at places. ‘reveal_door’ is unnecessarily complicated with sets.

I rewrote reveal_door as follows –

———————-

Timings on my thinkpad –

raja@ /data/tmp:

time python monte_carlo.py

Staying: 33.264000%

Switching: 66.736000%

real 0m13.573s

user 0m13.457s

sys 0m0.012s

## Code modified ##

raja@ /data/tmp:

time python monte_carlo.py

Staying: 33.281600%

Switching: 66.718400%

real 0m12.657s

user 0m12.621s

sys 0m0.024s

Antonio,

Of course this is “mathematically correct” (Or statistically correct if you want to be specific :-)). But I think it’s one of the cases where common sense has it advantage. Although numerically the chances are between three doors, when the show’s host opens up his door, that leaves us without a door to choose from (Who would pick a door which we already know has a goat behind it).

However, these things are true and I’m quoting here something similar: Suppose that you’re going off for the weekend and I ask you “Are you sure that you turn the lights off before going out?” and you reply “Yes, I’m 99% sure”. Then I propose you this:

We have a jar with 99 white balls and 1 black ball. I’ll give you $1000 if either you can assure that you really did turn off the lights or if you’re able to pick a white ball from the jar (The probability of doing so is, in fact, 99%, the same amount you’ve said about turning the lights off). We keep doing this rising the amount of black balls in the jar (Replacing the white balls of course). Just when you’d prefer go home to check on the lights rather than picking up a white ball from the jar, is at that point you really can say “I’m XX% sure that I turned the lights off” (XX being the probability to pick up a white ball from the jar)

Antonio:

Statistics can only be used for multiple events, which is why it is invalid to apply them to a single game of Let’s Make a Deal. Without a doubt, your Monte Carlo simulation is valid for predicting the frequency of outcomes from a large sample of identical scenarios. However, for a single game show participant, faced with 2 doors only, and only one choice, why would it matter what would happen over the course of a million games? I think it makes no difference which door he chooses. In the exagerated case of the billion doors, Monty does him a favor and eliminates all but one of the goats, and at that point there are still just 2 doors. Why does it matter that there used to be a bunch of other doors with goats? How does the prior existence of other doors with goats affect the situation at the moment when the participant must make his choice? At that moment, there are 2 doors, 1 car, and 1 goat. Statistics don’t apply.

A related example is this: Let’s say that in the general population, there is a 1% incidence of a gene for a rare fatal disease. Does that mean that I personally have a 1% chance of dying of that disease? No, because I personally either have the gene or I don’t. If I haven’t been tested, I may not know whether I have it or not, but my ignorance doesn’t change the fact that I either have it or I don’t. Statistics don’t apply to single incidences of anything.

On perhaps a related note: Richard Feynman said in one of his public lectures: “While driving to the lecture, I saw a car with the license plate XRB-367. Out of all the license plates in this state, what are the odds that I would see this particular plate! This is a truly amazing circumstance! I can hardly believe it!” (Quote not exact) Of course, he was joking to make a point that unusual events happen constantly, and that you have to be careful when applying statistics to isolated events. At least, I’m pretty sure those were his points.

@Kent:

I sort of understand your line of thought, but you are confused about probability theory.

For a participant in real life, statistics does apply and he really stands a better chance if he opts for the other door.

For your genetics example, if there is a 1% prevalence of the gene in the population, you do have a 1% chance of having it. After you test for it, you know if you have it or not, but that doesnt mean statistics doesn’t mean anything here.

Same goes for the car plate. Before Feynman started for his lecture, the probability that he would see that number on a license plate is indeed small.

What are the times with Groovy?, looks interesting and very clean code. I like Python one, looks very clean too.

Raja:

Thanks for your thoughtful reply. I may be confused as you say. But it does seem strange to apply statistics to single events, doesn’t it? Maybe that’s where the counterintuitiveness arises from.

I think what is driving my thinking on this is the common example from statistics classes of the coin flipped 9 times in a row, all heads. The question then is asked, what is the probability that the 10th flip will be heads as well? The answer (50%) is counterintuitive to many people, who think: What are the odds of getting heads 10 times in a row? Of course the coin doesn’t “know” that it was previously flipped 9 times in a row with a heads result.

I was trying to apply similar reasoning to the Monty Hall question, i.e. the two remaining doors don’t “know” that there used to be other doors in the equation. But this reasoning is faulty, you are saying.

Thanks again for a good discussion.

thx for the nice post!

Here’s a variation that I really like. One day on the show, Monty forgets to find out where the car in advance. In the spirit of “the show must go on”, he opens one of the doors and luckily, reveals a goat. Nobody else knows that he had any doubts. What are the odds now for staying vs switching?

Think about the Monty Hall question with a hundred doors. You pick one at random. Monty opens all the others but one. Only your door and one other remain closed. Do you switch?

What if we start with a million doors?

Jonathan, did you read the part of my post were I talked about a billion doors? ðŸ™‚

@Kent: Unlike the coin flipping example, Monty isn’t opening doors at random — he is opening doors that he knows do not have the car behind them. So your initial choice was at random but Monty’s choices definitely are not.

@Tim Kington: Interesting case, but I don’t think it changes anything. Monty had a 50-50 chance of opening the wrong door and showing a car. Since he got lucky and showed a goat, I now still have a 2/3 chance of getting the car by switching. Think about the 1 billion doors again. Monty forgets which door has the car. He then opens all but one and he gets lucky (very lucky!) and opens doors with only goats. Are you going to switch? I certainly would.

@Chad: It’s very counter intuitive, but it does change the probability. Switching doesn’t matter if Monty doesn’t know where the car is. In your case with a billion, it seems like you have a better chance by switching, but you actually don’t – Monty will accidentally show you the car almost every time, and the chance of him luckily not exposing it is the same as you choosing it in the first place. You can prove it by running a simulation. When Monty opens a door, you have him choose it at random, and you throw out the cases where he accidentally exposes the car.

As a result of reading your column, I wrote up a quick Monte Carlo Solution to the Birthday Problem:

Given N number of people with randomized birthdays (excluding Feb 29th birthdays), what is the minimum value of N where you can expect (ie: 50% probability) at least two people having the same birthday?

http://gist.github.com/44189

Surprisingly enough, the answer is 23.

I find it easier to think of this as if you are actually switching from your single chosen door to both of the two other doors.

Since the rules are known before the game you know you’ll pick one door and then you’ll be able to switch to the other two. Since you know you’ll switch you can think of it as if you already “own” the two other doors. When Monty opens one of them, he is just helping you to open a losing door first.

It’s actually just like if you were given the choice of selecting one of the three doors or two of the three doors.

What’s really amazing is the number of posts here without a single disrespectful comment. In this day of internet anonymity, what are the odds of that?

I almost said something awful just because of your comment, Kent.

Not really, just kidding.

On my desktop :

What if I switch the previously selected door with itself? It’s the same as switching with the other one, right?

I’m no math-geek at all, I just want to see the light because I’m really puzzled! ðŸ™‚

Monty Hall problem

Lets suppose the contestant comes on stage and selects door # 1 in the hope that a prize is in door 1 and not a goat. Before Monty Hall does or says anything, an associate comes out and whispers something to Monty Hall. Now Monty Hall turns to the contestant and says: I just have been informed that you are a veteran. Therefore I want to better your odds and have you decide only between 2 doors. He than tells a Stage hand to remove Door 3 from the stage. ( Of course Monty knew that there was a goat behind door 3). Now he turns to the contestant and says: “There you are, 2 doors. New game. You want to switch from door # 1 to door # 2, or stay with your original choice?” Same setup as in the original problem, nothing changed except there only 2 doors now, and the contestant has now a chance of staying with his original choice or selecting the other door.

That’s now 50-50 probability, even though it’s the same problem. Which means the original problem must also be a 50-50 proposition.

Gustav, it’s not a true 50/50. Since he picked his door when there were more goats than cars, the veteran’s door is more likely to have a goat in it, His door, therefore, is tainted by previous bad odds.

The only time it’d be 50/50 is if Monty removed the goat first, then let the vet select a door. One may think this is the same thing, but remember: the door eliminated in your scenario is guaranteed NOT to be the contestant’s door. This means the contestant keeps the odds he was originally dealt (1:3), unless he switches.

How do you modify the script to simulate a case where there are say 2 cars and 5 goats?