Andreas Weigend
Stanford University
Stat 252 and MS&E 238
Spring 2008

Homework 4

The login info for Homework 4 went out by email on Thursday, May 1, 2008 at 2:20pm. It won't be posted here on the wiki because that would make the FFS database publicly available. James Mao wrote some starter code to help you access the database, which was also sent out by email.

Working in the same group as for Homework 3, answer 4 of your proposed questions using the FFS data. For each question, make a plot and give a brief explanation. Here we are looking for clarity of thought and presentation, so don't overload us with extraneous information!

Please submit your homework by Thu May 8, 2008 (5pm).

Hi guys -
This is Siqi. I've been told that some groups have had trouble accessing the data. Please let me know what the issues are and I can try to help out. I am seeing queries from Stanford on our servers so the login info you have appears to be correct.
- Siqi

Siqi: The downloads are incredibly slow. I haven't been able to download an entire dataset from any of your tables. The most I've been able to get is ~20K rows.

Didn't he say that the tables are something like 500GB? I think you have to query the database directly with the Python/SQL starter code rather than download whole tables, which has worked great for me...
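
For anyone unsure what "doing it directly in SQL" looks like, here is a rough sketch of letting the database aggregate so that only a small summary comes back. It is not the starter code: the in-memory SQLite database, the `events` table, and its `user_id` and `day` columns are all made up for illustration, and the real FFS schema and connection will differ.

```python
import sqlite3

# Stand-in database: an in-memory SQLite table with made-up columns.
# The real FFS tables and the connection from the starter code will differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, day TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "2008-04-01"), (2, "2008-04-01"), (1, "2008-04-02")],
)


def daily_counts(connection):
    """Return (day, row_count) pairs, aggregated by the database itself."""
    query = "SELECT day, COUNT(*) FROM events GROUP BY day ORDER BY day"
    return connection.execute(query).fetchall()


# Only one row per day crosses the network, not every raw row.
print(daily_counts(conn))
```

The point is that a GROUP BY returns one row per group, so the result stays small even when the underlying table is huge.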

I ended up with a couple of text files that were 100MB+. My text editor and R wouldn't open them, so I wrote a grep script to aggregate them. - Ryan
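
If you would rather stay in Python than use grep, a similar aggregation can be done by reading the file one line at a time, so it never has to fit in memory. This is only a sketch under assumptions: the function name `count_by_column` is hypothetical, and it assumes a tab-separated dump with the value to count in the first column.

```python
from collections import Counter


def count_by_column(lines, column=0, sep="\t"):
    """Count how often each value appears in one column of delimited text.

    `lines` can be any iterable of strings, including an open file object,
    which yields one line at a time instead of loading the whole file.
    """
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split(sep)
        if len(fields) > column:
            counts[fields[column]] += 1
    return counts


# Small in-memory sample standing in for a large dump file.
sample = ["a\t1\n", "b\t2\n", "\n", "a\t3\n"]
print(count_by_column(sample))
```

To run it on a real dump, pass an open file instead of the sample list, for example `with open(path) as f: counts = count_by_column(f)`, where `path` is wherever you saved your data.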