A few days ago I found myself having a vague recollection of an interesting statistics problem. All I could remember was that it had to do with having a room full of people and the probability that any two people in that room would have the same birthday. I remembered the point, which was that it is much more likely than you might think, but I was fuzzy on the details.
After trying to define the problem and find an answer mathematically, I remembered that I suck at statistical reasoning about as much as the average person. So I decided to model the problem with a short Python script and find the answer that way.
Sure, I could’ve looked it up, but where’s the fun in that?
The problem: There are n people (say, at a party) drawn randomly from a population in which the chances of having a birthday on any day is equal to having a birthday on any other (which is not true of real populations (probably)). What is the probability of there being at least two people with the same birthday in the sample?
To put this thing together, I figure we need three things:
- The ability to generate random numbers (provided by Python’s random module);
- An object representing each person;
- A party object full of those people.
Then we can add things like the ability to choose how many people we want at the party and how many parties to have, as well as some output for making plots!
First, the Person object. All each person needs is a birthday:
import random random.seed() class Person: def __init__( self ): self.birthday = random.randint( 1, 365 )