A few days ago I found myself having a vague recollection of an interesting statistics problem. All I could remember was that it had to do with having a room full of people and the probability that any two people in that room would have the same birthday. I remembered the point, which was that it is much more likely than you might think, but I was fuzzy on the details.
After trying to define the problem and find an answer mathematically, I remembered that I suck at statistical reasoning about as much as the average person. So I decided to model the problem with a short Python script and find the answer that way.
Sure, I could’ve looked it up, but where’s the fun in that?
The problem: There are n people (say, at a party) drawn randomly from a population in which a birthday is equally likely to fall on any day of the year (which is probably not true of real populations). What is the probability that at least two people in the sample share a birthday?
To put this thing together, I figure we need three things:
- The ability to generate random numbers (provided by Python’s random module);
- An object representing each person;
- A party object full of those people.
Then we can add things like the ability to choose how many people we want at the party and how many parties to have, as well as some output for making plots!
First, the Person object. All each person needs is a birthday:
import random
class Person:
    def __init__(self):
        self.birthday = random.randint(1, 365)  # day of the year, 1 to 365
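To show where this is headed, here is a minimal sketch of how the party and the repeated trials might fit together. Everything beyond the birthday attribute is an assumption made for illustration: the names Party, has_shared_birthday, estimate_probability and exact_probability are hypothetical, as is the optional rng parameter (added only so a seeded generator can be passed in). The closed-form function is there purely as a cross-check on the simulation.

```python
import random


class Person:
    """A party guest; the only attribute is a birthday (day number 1 to 365)."""

    def __init__(self, rng=random):
        self.birthday = rng.randint(1, 365)


class Party:
    """A room of n randomly generated people (hypothetical class name)."""

    def __init__(self, n, rng=random):
        self.people = [Person(rng) for _ in range(n)]

    def has_shared_birthday(self):
        # A set drops duplicates, so it is smaller than the guest list
        # exactly when at least two people share a birthday.
        birthdays = [p.birthday for p in self.people]
        return len(set(birthdays)) < len(birthdays)


def estimate_probability(n, parties=10_000, rng=random):
    """Fraction of simulated parties of size n with a shared birthday."""
    hits = sum(Party(n, rng).has_shared_birthday() for _ in range(parties))
    return hits / parties


def exact_probability(n):
    """Closed-form answer under the same uniform-birthday assumption."""
    p_all_different = 1.0
    for i in range(n):
        # The (i + 1)-th guest must avoid the i birthdays already taken.
        p_all_different *= (365 - i) / 365
    return 1.0 - p_all_different


if __name__ == "__main__":
    # Print n, the simulated estimate, and the exact value side by side.
    for n in (10, 23, 50):
        print(n, round(estimate_probability(n), 3), round(exact_probability(n), 3))
```

More parties per run gives a steadier estimate at the cost of a longer wait; the simulated and exact columns should agree more closely as that number grows.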
This post discusses a computer program that you can download to try yourself (and get the source code if you want to make your own version).
At a family reunion earlier this summer, we were handed a wordfind that someone had generated somewhere on the Internets that contained the names of the family founders. I was solving mine and noticed that, as anyone who has solved one has probably observed, in any given wordfind you will find words that are not in the list. Presumably, this is due to the randomly assorted letters spelling out an unplanned word by chance. Of course, the wordfind makers might also stick those in on purpose (for example, the family wordfind contained the website name multiple times) or purposely prevent some random words (profanity). Regardless, I began to wonder how often a word might appear in a wordfind just by chance. So I used the margins to scratch out a formula for the chance of finding a word of a certain length within a matrix of random letters.
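One rough way to put numbers on that question is sketched below. This is not necessarily the formula from the margins; it is a back-of-the-envelope version under assumptions of my own: every cell is an independent, uniformly random letter from a 26-letter alphabet, and a word may run in any of the eight straight-line directions. The function names are made up for illustration. It counts the places a word of a given length could sit and multiplies by the chance that any one placement spells a specific word.

```python
# Assumptions (not from the post): each cell is an independent, uniformly
# random letter, and words may run in any of eight straight-line directions.
DIRECTIONS = [
    (dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)
]


def count_placements(rows, cols, length):
    """Number of (start cell, direction) pairs where a word of this length fits."""
    if length < 2:
        raise ValueError("length must be at least 2 for direction to matter")
    total = 0
    for dr, dc in DIRECTIONS:
        # Moving along an axis uses up (length - 1) cells of room on that axis.
        row_starts = rows - (length - 1) * abs(dr)
        col_starts = cols - (length - 1) * abs(dc)
        if row_starts > 0 and col_starts > 0:
            total += row_starts * col_starts
    return total


def expected_occurrences(rows, cols, length, alphabet_size=26):
    """Expected number of chance appearances of one specific word."""
    # Each placement spells the word with probability (1 / alphabet_size) ** length;
    # by linearity of expectation the placements simply add up.
    return count_placements(rows, cols, length) / alphabet_size ** length
```

For a 15 by 15 grid, a three-letter word has 1,456 possible placements, so a specific one is expected to show up about 0.083 times by chance. The probability of seeing it at least once can be no larger than that expected count, which makes this a quick upper bound rather than an exact answer.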
Inspired by Dawkins’ METHINKS IT IS LIKE A WEASEL program (hereafter just weasel) described in his book “The Blind Watchmaker,” and wanting to practice my blossoming C++ skills, I decided to write my own version of weasel. It was successful enough, and I found the results interesting enough to warrant discussion. Download the program (Windows .exe file) so you can try it out for yourself (and you can also get the source code if you want). In this post I’ll discuss what the program does and why. In the next post I’ll talk a bit about the results of the program.