Resolving the Birthday Paradox with Matlab
28 Feb 2008 Quan Quach 2 comments 1197 views

Photo taken by chipgriffin
Go shawty. It’s your birthday. We gonna party like it’s your birthday. -50 Cent
Since I started this week with a riddle/puzzle/problem, I figured I might as well end the week with another riddle/puzzle/problem. A very well known problem that I would like to discuss today is the Birthday Paradox. The problem is given this moniker because the solution is so surprising that people simply cannot believe it!
The Problem Statement
- To keep things a little simpler, let’s assume that we exclude any birthdays on February 29th (Leap Year babies). In addition, let’s assume that all 365 birthdays are equally likely to occur.
- Assume that there are x number of people in a given room and that these people are all chosen at random.
- How many people would we need to have in the room in order to have better than 50% chance that there will be at least two people who share the same birthday?
- How many people would we need to have to have a 99% chance?
This problem can be solved if you have a good background in probability. There’s a pretty elegant solution that can be arrived at analytically. You can see how it’s done here. But if you have no idea on how to proceed, let’s try doing it in Matlab.
Solving the Birthday Paradox with Matlab
-
Using Matlab to solve this problem might be a little easier than you would expect (it was certainly easier than I expected). My approach was to first write a functon that would simulate the percentage for a given number of people. I saved this function as an m-file. You can see the code below.
-
function result = birthdayMatch(numPeople) total = 0; %do the simulation for 20,000 iterations for x = 1:20000 %create a random vector of birthdays for the specified number of people birthdayVector = ceil(rand(1,numPeople)*365); %the mode command returns the number (a) that occurs the most %if there are two numbers that occur with the same frequency, %the lowest number is returned %the second return argument (b) is the number of times (a) occurs [a,b] = mode(birthdayVector); %if there is a birthday match if b > 1 total = total + 1; end end result = total/x; %get the average percent of a matching birthday
-
So now we have a function that will perform 20,000 iterations of the event, we can use this function to solve the above problem
-
At the matlab command prompt, we can enter the following:
%the following commands might take a couple of seconds to execute %so be patient! %get the percent chance of a birthday match with 5 people fivePeople = birthdayMatch(5) %get the percent chance fo a birthday match with 23 people twentythreePeople = birthdayMatch(23) %get the percent chance fo a birthday match with 57 people fiftysevenPeople= birthdayMatch(57)
The Answer!
The theoretical answer for this problem is 23 people for 50% chance and 57 people for 99% chance.
The simulations basically confirm that the theoretical answer is indeed correct, but the answers might be off by a little bit. If you want more precise answers, you can increase the number of iterations used. Of course, this will cause the computation time to increase as well.
How Would You Solve This Problem?
Did any of you guys have a different idea on how to solve this? I would love to hear how you guys went about solving this problem. Whether you used analytical methods or used a computer program, I’m always interested in how problems can be approached from different angles.
My first Math Problem in College
This was the first problem that was presented to me in Arthur Benjamin’s probability class at Harvey Mudd. For those of you who don’t know, Benjamin is a world renowned mathemagician (mathematician + magician). Click here if you’re interested in seeing some of his spiffy tricks and feats.
2 Responses to “Resolving the Birthday Paradox with Matlab”
Leave a Reply
Include MATLAB code in your comment by doing the following:
<pre lang="MATLAB">
%insert code here
</pre>

In the spirit of matlab, you should be vectorizing your solution….
x = 20000;
birthdayVector = randsample(365,x*numPeople,true);
birthdayVector = reshape(birthdayVector,[x numPeople]);
[a b]=mode(birthdayVector,2);
numel(b(b>1))/x
I was waiting for someone to say that! Yes, vectorizing is good, but I used the method that I did because I thought it might be a little easier to understand. Thanks for the feedback!