The Riddle
There are 50 prisoners, and one prison keeper. The prisoners will be placed in separate cells, never able to communicate. Once per day, the prison keeper will randomly select one prisoner and take him into a room with two levers. The levers have two positions each: Up, and down. The prisoner may operate the levers however he wishes, then he must return to his cell. (The levers don't do anything; they just represent a binary state.) The prisoners have one chance to indicate to the prison keeper whether all the prisoners have been in the lever-room. If they so indicate too early, they will never be released. If they're correct (even if they're late), they will be released immediately. The prisoners get one chance to meet prior to imprisonment, to discuss strategy. How can they ensure their release?The Answer
Only one lever is needed. Elect one prisoner to be a 'counter.' When a prisoner enters the lever room and the lever is in the 'down' position and he has never moved the lever, then and only then will he move the lever to the 'up' position. Only the 'counter' prisoner may return the lever to the 'down' position, and each time he does so he increments a counter. When the counter reaches 50, he knows that all prisoners have been into the lever room at least once. He can then inform the prison guard, and all the prisoners will be released.A Solution in Python
I came up with the following model in Python to describe the prison scenario. There are probably better and more optimized way to represent this situation, but here's the first one I came up with.#Iteration 1: The basic scenario import random day_number = 0 prisoner_designee = 1 # chosen by prisoners to be the official counter counter = 0 # as tracked by the prisoner_designee lever_position = 'down' # starting position of the lever prisoners_who_have_operated_lever = [] while counter < 50: prisoner_number = random.randint(1,50) # Select a prisoner if not prisoner_number in prisoners_who_have_operated_lever: if lever_position == 'down': prisoners_who_have_operated_lever.append(prisoner_number) lever_position = 'up' if prisoner_number == prisoner_designee: # If it's the designee... if lever_position == 'up': # ...and the lever is up... lever_position = 'down' # ...put the lever back down... counter += 1 # ...and increment the counter. day_number += 1 print day_numberThe above code, iteration 1, uses
random.randint
to generate a random integer. I used if
statements to describe the logic, and printed out the result. Example:$ python prisoner1.py 2536This means that for this particular run, it would have taken the prisoners (or rather, the prisoners' designated counter prisoner_designee) 2,536 days to count 50 'up' levers, proving that all prisoners had entered the lever room and ensuring their release. That's just under seven years(!).
But this result depends on the order in which the prison keeper (in our case represented by random.randint) chose the prisoners. Running this code at the command line produces different results each time, usually in the range of 2500--3000 (days). To get a better picture of how long it would take a new group of prisoners to earn their release we'd have to simulate the whole range of possible outputs (or which there are nearly infinitely many) But I don't want to run this manually dozens of times and tabulate the results by hand; I want to make the code run the simulation thousands of times, with random inputs each time, and then aggregate the results. That's the Monte Carlo method.
Implementing the Monte Carlo Method
We're going to use a loop to run the experiment, so first, wrap the experiment in a function:#Iteration 2: Wrapping the simulation in a function import random def run_simulation(): day_number = 0 prisoner_designee = 1 # chosen by prisoners to be the official counter counter = 0 # as tracked by the prisoner_designee lever_position = 'down' # starting position of the lever prisoners_who_have_operated_lever = [] while counter < 50: prisoner_number = random.randint(1,50) # Select a prisoner if not prisoner_number in prisoners_who_have_operated_lever: if lever_position == 'down': prisoners_who_have_operated_lever.append(prisoner_number) lever_position = 'up' if prisoner_number == prisoner_designee: # If it's the designee... if lever_position == 'up': # ...and the lever is up... lever_position = 'down' # ...put the lever back down... counter += 1 # ...and increment the counter. day_number += 1 return day_number print run_simulation()Running this code should produce the same type of output (with a different result, of course):
$ python prisoner2.py 2950The next step is to write a loop that will call this function a thousand times:
#Iteration 3: Loop it to run 1000 times import random def run_simulation(): counter = 0 day_number = 0 prisoner_chief = 1 lever_position = 'down' prisoners_who_have_operated_lever = [] while counter < 50: prisoner_number = random.randint(1,50) # Select a prisoner if not prisoner_number in prisoners_who_have_operated_lever: if lever_position == 'down': prisoners_who_have_operated_lever.append(prisoner_number) lever_position = 'up' if prisoner_number == prisoner_chief: if lever_position == 'up': lever_position = 'down' counter += 1 day_number += 1 return day_number simulation_results = [] simulations = 0 while simulations < 1000: simulation_results.append(run_simulation()) simulations += 1 print simulation_resultsIn this iteration we use a
while
loop to run the simulation 1000 times, storing the results in the list called simulation_results. Here's an example of the output:$ python prisoner3.py [2444, 2119, 2818, 2253, 2586, 2543, 2490, 3388, 2034, 2739, 2554, 2585, 2498, 2689, 3180, 2760, 3145, 2698, 2196, 2769, 2400, 2783, 3091, 2258, 2952, 1730, 2974, 2656, 3059, 2000, 3222, 2186, 3114, 2618, .... ](List truncated.)
The last step in the Monte Carlo method is to aggregate and interpret the results. In this case, that means taking the average of all the simulation runs. A simple way to do this with lists is to sum the list :
average = sum(list)/len(list)A more elegant way is to use Numpy's built-in
mean
function, which requires first converting the list
to a Numpy array
:import numpy
array = numpy.array(list)
average = numpy.mean(array)The next iteration of the code will include this small refactorization. Running this code produces the following result:
$ python prisoner3.py 2720.895Now, instead of one experiment, this result represents the average of 1000 experiments.