In code on
8 June 2009 tagged python, statistics with no comments
I’ve recently been brushing up on my statistics by reading Principles of Statistics by M.G. Bulmer. I came across a problem and since I wrote some code to check my answer, I figured I’d post it with a short discussion about the answer.
First, the question:
In a certain survey of the work of chemical research workers, it was found, on the basis of extensive data, that on average each man required no fume cupboard for 60 per cent of his time, one cupboard for 30 per cent and two cupboards for 10 per cent; three or more were never required. If a group of four chemists worked independently of one another, how many fume cupboards should be available in order to provide adequate facilities for at least 95 per cent of the time?
My approach was to enumerate every combination of the 4 chemists needing 0, 1, or 2 cupboards, work out the probability of each combination, and then add up those probabilities grouped by the total number of cupboards needed.
For example, out of the 81 different possible combinations of cupboards required (3 * 3 * 3 * 3, with the 3 coming from 0, 1, or 2 hoods needed), there is only one in which 0 hoods are needed in total: all 4 chemists need no cupboards. Next, there are 4 ways for exactly 1 hood to be required in total, with one chemist requiring a cupboard and the other three needing none (1, 0, 0, 0 & 0, 1, 0, 0 & 0, 0, 1, 0 & 0, 0, 0, 1).
So having one cupboard covers the probability of needing no cupboards amongst the 4 plus the probability of needing 1 cupboard amongst the 4.
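Written out, the two cases above look like this. Here P(n) is just my shorthand (not the book's notation) for the probability that exactly n cupboards are needed in total:

```latex
\begin{align*}
P(0) &= 0.6^4 = 0.1296 \\
P(1) &= 4 \times 0.3 \times 0.6^3 = 0.2592 \\
P(\le 1) &= P(0) + P(1) = 0.3888
\end{align*}
```

So a single cupboard would be enough only about 39 per cent of the time, well short of the 95 per cent target.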
I first did this problem longhand, figuring out the probability of needing 0, 1, 2, and so on cupboards until the running sum exceeded 0.95. I was making a simple arithmetic error (as usual) and my answer did not match the one in the back of the book. Since I was confident in the method and was just having trouble multiplying and adding, I wrote a simple program to calculate the answer.
Here’s the Python script I wrote (note: you need Python >= 2.6 as I use itertools.product to recreate all the combinations of cupboards needed).
The output is the running sum of the probabilities of needing 0 cupboards, 1 cupboard, 2 cupboards, and so on, printed as percentages. The first line with a value greater than 95 is the answer. In this case, the answer is 4, which would cover the chemists’ needs 95.85 percent of the time.
from collections import defaultdict
import itertools

# probability that a single chemist needs 0, 1, or 2 cupboards
probs = {
    '0': 0.6,
    '1': 0.3,
    '2': 0.1
}

# every combination of needs for the 4 chemists (3**4 = 81 trials)
trials = itertools.product('012', repeat=4)
totals = defaultdict(float)
for trial in trials:
    # how many hoods are needed in this trial
    trial_sum = sum(int(x) for x in trial)
    # probability of this exact trial occurring
    total_prob = 1.0
    for item in trial:
        total_prob *= probs[item]
    # add the trial's probability to the total for this number of hoods
    totals[trial_sum] += total_prob

# print the cumulative probability (as a percentage) for each number of hoods
running_prob = 0.0
for i in sorted(totals):
    running_prob += totals[i]
    print("%d %s" % (i, running_prob * 100))
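As a cross-check, here is a rough sketch of a different way to get the same numbers. Instead of enumerating all 81 trials, it adds one chemist at a time and combines the distributions (a discrete convolution). This is not the approach I used above, and the function names total_distribution and cupboards_needed are made up for this illustration.

```python
def total_distribution(single, people):
    """Distribution of the total number of cupboards needed by `people` chemists.

    single[k] is the probability that one chemist needs k cupboards.
    """
    dist = [1.0]  # with zero chemists, zero cupboards are needed for certain
    for _ in range(people):
        new = [0.0] * (len(dist) + len(single) - 1)
        for i, p in enumerate(dist):
            for k, q in enumerate(single):
                # i cupboards so far plus k more for the next chemist
                new[i + k] += p * q
        dist = new
    return dist


def cupboards_needed(single, people, coverage):
    """Smallest cupboard count whose cumulative probability reaches `coverage`."""
    dist = total_distribution(single, people)
    running = 0.0
    for n, p in enumerate(dist):
        running += p
        if running >= coverage:
            return n, running
    # only reached if rounding keeps the sum just below `coverage`
    return len(dist) - 1, running


print(cupboards_needed([0.6, 0.3, 0.1], 4, 0.95))
```

For the numbers in the question this should agree with the script above: 4 cupboards, covering roughly 95.85 percent of the time.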
In code on
23 April 2009 tagged git, github, twitter with no comments
It was a night of firsts.
I wrote my first Twitter widget. It is the ‘Latest Tweet’ widget on the far right column of the page that uses Twitter’s public JSONP API to pull in my last Twitter update.
The only thing even mildly interesting is that it has a couple of regexes that find any ‘@’ names and link them up, and (naively) turn any URLs into hyperlinks as well. These weren’t difficult, but they always take a little bit of tinkering to get right.
More exciting, I think, is that I decided to push it out to GitHub under a BSD license. This is technically my first open source software, minor as it is.
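To give a feel for that kind of linking, here is a minimal sketch in Python. It is not the widget's actual code (which runs in the browser), and the two patterns, the linkify name, and the twitter.com profile URL format are assumptions made for this illustration.

```python
import re

# Assumed patterns: a URL is http(s):// followed by non-space characters,
# and a mention is '@' followed by letters, digits, or underscores,
# not directly preceded by a word character (so email addresses are skipped).
URL_RE = re.compile(r'(https?://\S+)')
MENTION_RE = re.compile(r'(?<!\w)@(\w+)')


def linkify(text):
    """Wrap URLs and @names in anchor tags (naive: no HTML escaping)."""
    # URLs go first, otherwise the URL pattern would match inside the
    # href attributes generated for the mentions.
    text = URL_RE.sub(r'<a href="\1">\1</a>', text)
    text = MENTION_RE.sub(r'<a href="http://twitter.com/\1">@\1</a>', text)
    return text


print(linkify("thanks @bob for http://example.com/a"))
```

It is naive in the same way: trailing punctuation gets swallowed into the link, and a URL that itself contains an ‘@’ would be mangled.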
In photography on
17 April 2009 with no comments
I haven’t put up a new post in a while, but I just uploaded a big batch of pictures to my Flickr account, so I figured I’d post a few of them here as a substitute.







In books, code on
13 March 2009 tagged linux, lulu, programming, sicp with no comments
I’ve been wanting to read SICP for a while, but with lots of other books on my to-read list, as well as the $50 price tag for a used copy, I’ve put it on hold. The price, while relatively steep, wouldn’t usually stop me from picking up a book I really want, but I held off mainly because the book is freely available on its website under the Creative Commons Attribution-Noncommercial license, and $50 seems like a lot to pay for a free-as-in-beer book.
Since SICP runs close to 600 printed pages and approximately 40 HTML files, I’d rather not read it in my browser, and printing it on the home printer is not really an option. I decided that using Lulu might be a workable solution.
Lulu takes PDFs so step one was to convert the SICP website to one big PDF.
First, I used wget to mirror the site. Once I had all the files, I wanted to clean them up a little. Every page had previous and next links at the bottom, which are obviously not needed once the pages are in physical form. I ran the following sed command to remove these lines:
sed -i "/\[Go to/d" *html
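For anyone who would rather not use sed, the same filter can be sketched in Python: drop every line that contains the literal text "[Go to". This is only a rough equivalent, and the function names are made up for this illustration. It works on bytes so that it makes no assumption about the encoding of the HTML files.

```python
def strip_nav_lines(lines, marker=b"[Go to"):
    """Return only the lines that do not contain the navigation marker."""
    return [line for line in lines if marker not in line]


def clean_file(path):
    """Rewrite one HTML file in place without its navigation lines."""
    with open(path, "rb") as f:
        lines = f.readlines()
    with open(path, "wb") as f:
        f.writelines(strip_nav_lines(lines))
```

Calling clean_file on each mirrored HTML file would do the in-place edit, one file at a time.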
The next step was to convert the HTML to PDF. I used htmldoc for this particular task. First, I put all the names of the HTML files in one text file, on one line, and in the correct order. I called this file “all_files.txt”. The htmldoc command I used to convert to PDF is the following:
htmldoc -f sicp.pdf --webpage --left .75in --right .75in `cat all_files.txt`
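If the file list gets long enough that the backtick substitution becomes awkward, the same call can be assembled from a script. Here is a minimal sketch, assuming htmldoc is on the PATH and all_files.txt is in the current directory; the function names are made up for this illustration and the flags are exactly the ones in the command above.

```python
import subprocess


def build_htmldoc_command(file_list_text, output="sicp.pdf"):
    """Build the argument list for the htmldoc call shown above."""
    # Split on whitespace, roughly as the shell does with `cat all_files.txt`
    html_files = file_list_text.split()
    return ["htmldoc", "-f", output, "--webpage",
            "--left", ".75in", "--right", ".75in"] + html_files


def run_htmldoc(list_path="all_files.txt"):
    """Read the ordered file list and run htmldoc on it."""
    with open(list_path) as f:
        command = build_htmldoc_command(f.read())
    subprocess.check_call(command)
```

Keeping the command construction separate from the call makes it easy to print the argument list and check the file order before actually running the conversion.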
I then uploaded this file to Lulu and designed my (very) simple cover. I made it clear in the back cover text that I was printing this book under the rights granted by the aforementioned license and would receive no profit from it, and I included a link back to the original source. I’m not a lawyer, so I hope that covers all bases.
Lulu has a convenient feature that will let you do a private printing. I could probably make this book public, setting my profit to zero, and even though that would be covered under the license, it still feels strange to do.
I am very curious how this book will turn out. The Lulu process was actually fun, and if this turns out well, I could see myself using the service again. Once I get the book, I’ll post my review of the service and possibly some pictures of the final product. Either way, I’m excited to get a print version of this book for a much-reduced price.
Edited to add:
Here’s the download for the PDF: sicp.pdf