Recent Changes

Sunday, April 12

  1. page HW3 edited ... Homework 3 In this homework you'll be thinking about the Friends for Sale facebook applicatio…
    ...
    Homework 3
    In this homework you'll be thinking about the Friends for Sale Facebook application. You'll be given access to internal data from the FFS application. Siqi Chen has written up a detailed overview of the types of data you can expect, which you can find here: {FFSschema.txt}. In a group of 4 people, pose 5 questions about the FFS data, and say a little about why each question is relevant/worthwhile. Remember that next week your group is going to go ahead and try to answer these questions, so make sure they're feasible!
    ...
    per group to stat252@gmail.com by Sunday at 5pm.
    Also, don't forget to read Bo Cowgill's paper on prediction markets before next class: http://www.bocowgill.com/GooglePredictionMarketPaper.pdf

    5:18 pm
  2. page HW4 edited ... The login info for homework 4 went out Thursday May 1 2008 at 2:20pm by email. It won't be pos…
    ...
    The login info for homework 4 went out Thursday May 1 2008 at 2:20pm by email. It won't be posted here on the wiki because that would make the FFS database publicly available. James Mao wrote some starter code to help you access the database, which also was sent out by email.
    In your same group from Homework 3, you should answer 4 of your proposed questions using the FFS data. For each question, make a plot and give a brief explanation. Here we are looking for clarity of thought and presentation, so don't overload us with extraneous information!
    Please submit your homework
    Hi guys -
    This is Siqi. I've been told that some groups have had trouble accessing the data. Please let me know what the issues are and I can try to help out. I am seeing queries from Stanford on our servers so the login info you have appears to be correct.
    5:17 pm
  3. page HW4 edited ... Spring 2008 Homework 4 ... homework 4 went out Thursday May 1 2008 at 2:20pm In…
    ...
    Spring 2008
    Homework 4
    ...
    homework 4 went out Thursday May 1 2008 at 2:20pm
    In your same group from Homework 3, you should answer 4 of your proposed questions using the FFS data. For each question, make a plot and give a brief explanation. Here we are looking for clarity of thought and presentation, so don't overload us with extraneous information!
    Please submit your homework to stat252@gmail.com by Thu May 8, 2008 (5pm).
    Hi guys -
    This is Siqi. I've been told that some groups have had trouble accessing the data. Please let me know what the issues are and I can try to help out. I am seeing queries from Stanford on our servers so the login info you have appears to be correct.
    5:17 pm

Friday, April 10

  1. 12:44 pm

Monday, March 30

  1. page Jobs and Internships edited ... We are a young and small team, but we have a very clear sense of direction about where we are …
    ...
    We are a young and small team, but we have a very clear sense of direction about where we are going, and vision on where the discovery space is headed.
    We believe we have just the right amount of experience to make this happen. Not too much, not too little. ;) See here for more details on the team!
    LikeMe
    JEFF, TEXT GOES HERE

    YourVersion, Palo Alto
    Dan Olsen (dan@yourversion.com), founder
    5:58 pm

Monday, March 9

  1. page home edited ... Stanford University Stat 252 and MS&E 238 Spring 2008 class time: Monday 3:15 - 6:05 pm…
    ...
    Stanford University
    Stat 252 and MS&E 238
    Spring 2008 class time: Monday 3:15 - 6:05 pm
    Spring 2008 class location: Gates B01

    Data Mining and Electronic Business
    This page is part of the course wiki, http://stanford2008.wikispaces.com/. If you are enrolled in the course, you should have write permission. If you do not, please click on the top of this page to ask for permission. If there is any other problem, please email one of the teaching assistants (info below).
    1:27 am

Friday, February 27

  1. page CourseDescr2008 edited Stanford University Stat 252 and MS&E 238 Spring 2008 class time: Monday 3:15 …
    Stanford University
    Stat 252 and MS&E 238
    Spring 2008 class time: Monday 3:15 - 6:05 pm
    Spring 2008 class location: TBD (probably Gates B01)
    This file (weigend.com/teaching/stanford) is the official course page.
    Other resources are probably more important:
    Spring 2008 list of students
    Spring 2008 course wiki
    Spring 2007 course wiki
    Data Mining and Electronic Business
    This course is about people and data: Collecting data about behavior on the web, in social networks, in communication, on dating sites, etc. Mining the data, building predictive models, creating (and rejecting) hypotheses, designing cool experiments, and learning from them quickly. And figuring out what is truly new, what is similar to the past, and what the underlying drivers are. We will discuss the impact of the communication and data revolution on individuals, business, and society, i.e., to many aspects of the world we live in.
    5:31 pm
  2. page CourseDescr2008 edited Stanford University Stat 252 and MS&E 238 Spring 2009 class time: Monday 2:15 - 5:05 pm Spr…
    Stanford University
    Stat 252 and MS&E 238
    Spring 2009 class time: Monday 2:15 - 5:05 pm
    Spring 2009 class location: TBD (probably Gates B01)
    This file (weigend.com/teaching/stanford) is the official course page.
    Other resources are probably more important:
    Spring 2008 list of students
    Spring 2008 course wiki
    Spring 2007 course wiki
    Data Mining and Electronic Business
    This course is about people and data: Collecting data about behavior on the web, in social networks, in communication, on dating sites, etc. Mining the data, building predictive models, creating (and rejecting) hypotheses, designing cool experiments, and learning from them quickly. And figuring out what is truly new, what is similar to the past, and what the underlying drivers are. We will discuss the impact of the communication and data revolution on individuals, business, and society, i.e., to many aspects of the world we live in.
    The 90’s, the decade of algorithms (data mining), focused on the question: "Given these data, what insights can you get?". Great algorithms were invented, refined, and their strengths and weaknesses understood. The current decade is the decade of data (data mining), and the question has shifted to: "Given these problems, what data can you get?". Furthermore, economic aspects of data are becoming increasingly important, with the question becoming "Who pays whom?".
    The first half of the course focuses on data: Click data (what all can be collected and what it is useful for), intention data (such as the queries from the searches you do; we will also discuss social search), attention data (such as tags on social bookmarking sites, with their important application for discovery), and interaction data (of email headers and social networking sites). The second half of the quarter focuses on models and on creating appropriate structures and incentives. We will discuss models for products (recommender systems), people (reputation systems), situation and location.
    The second half discusses applications. They range from personalization, recommendations and online marketing (behavioral and situational targeting), to the principles behind collective intelligence, reputation systems and peer-production, as well as prediction markets as yet another way of gleaning data from people and fostering interactions between them.
    Students are expected to actively engage in class discussions, to have their assumptions challenged, and to bring their various backgrounds to bear to make it a great experience for themselves and everybody else. We will also have some great guest speakers come to class.
    After each class, a detailed write-up is created by the students on the course wiki (see 2007). To help prospective students with the decision of whether to take this course, previous syllabi (2004, 2005) might also be useful.
    Schedule: We meet once a week (Monday afternoons) for 3 hours. The dates in Spring 2008 are:
    Apr 7 The Business of Data
    Apr 14 Click, Intention, and Attention Data
    Apr 21 Social Networks and Viral Marketing
    Apr 28 Prediction Markets
    May 5 Reputation Systems, Instrumenting the Planet
    May 12 Location Data (Mobile)
    May 19 Discovery Systems (Products, People)
    [no class on May 26, Memorial Day]
    Jun 2 Personal Genome (tentatively guest from 23andme)
    Jun 6 Outlook, and Project presentation by students
    Note that the last class is our slot for finals: Friday, 12:15 - 3:15
    Meeting only once a week proved useful in the past since it makes it as easy as possible for students to attend class in person. This is a lot more fun than just watching it over the web, and you learn a lot more. Note that this explicitly includes SCPD students who only signed up for remote access; just do not tell anyone :)
    Course wiki: All students have full read/write access to the course wiki at stanford2008.wikispaces.com. I encourage you to actively contribute -- the class and you will benefit.
    Grading: The main goal is that you get insights and that you transfer them to your area, coming up with some interesting ideas and applications. To support this objective, your grade will be determined by the following:
    Course wiki: We will form 8 groups. Each group is responsible for creating the initial wikipage for one of the classes by Friday 6pm (i.e., 4 days after class). These pages emphasize the key learnings of each class and have links to other materials wherever useful. [40%]
    Homework: There will be assignments. They are due the day before class at 5pm, such that we can look through them and give brief feedback in a timely manner. [40%]
    Class participation. [20%]
    Project: If you have a good and solid idea for an interesting project, I am happy to give feedback and jointly decide on whether it makes sense to do the project. I encourage projects in small groups. [optional]
    There are also internship opportunities available for students who like to code, both in the Bay Area and abroad, ranging from Bangkok (Agoda, online travel) to Helsinki (Fruugo, e-business).
    Readings
    Some of the material is very recent and originates from several academic disciplines. Besides statistics and computer science, it discusses modern marketing techniques, behavioral economics, social network analysis ideas and other concepts. Depending on your specific background and interests, the following might be useful:
    T. Segaran: Collective Intelligence (2007) Hands on, hacker mentality, includes Python code, useful for the del.icio.us recommendation engine homework
    P. Baldi, P. Frasconi, and P. Smyth: Modeling the Internet and the Web (2003) Background on web technology, solid statistical modeling of behavior, information retrieval
    C. Shapiro, and H.R. Varian: Information Rules (1998) Short book with insights about the networked economy (network effects, economics of digital goods, pricing, etc.)
    M.J.A. Berry and G.S. Linoff: Data Mining Techniques (pdf) (2004) Applications of data mining in broad marketing and business in general (not just web)
    T. Hastie, R. Tibshirani, and J.H. Friedman: The Elements of Statistical Learning (2003) The classic for more theoretical aspects in data mining
    C.M. Bishop: Pattern Recognition and Machine Learning (2006) Recent book on machine learning from a Bayesian perspective
    Readings and mp3 recordings of the classes are online at weigend.com/files/teaching/stanford/. We also have a Facebook group for the class.
    Teaching assistants, office hours and other information are on the course wiki.

    5:28 pm

Friday, September 12

  1. page 7. May 19, 2008 edited Deep recursion on subroutine "HTML::WikiConverter::Normalizer::_normalize_css" at /home/…
    Deep recursion on subroutine "HTML::WikiConverter::Normalizer::_normalize_css" at /home/site/wikispaces.com/release/current/lib/perl/HTML/WikiConverter/Normalizer.pm line 205.
    Deep recursion on subroutine "HTML::WikiConverter::Normalizer::_normalize_css" at /home/site/wikispaces.com/release/current/lib/perl/HTML/WikiConverter/Normalizer.pm line 205.
    Deep recursion on subroutine "HTML::WikiConverter::Normalizer::_normalize_attrs" at /home/site/wikispaces.com/release/current/lib/perl/HTML/WikiConverter/Normalizer.pm line 239.
    Deep recursion on subroutine "HTML::WikiConverter::Normalizer::_normalize_attrs" at /home/site/wikispaces.com/release/current/lib/perl/HTML/WikiConverter/Normalizer.pm line 239.
    Deep recursion on subroutine "HTML::WikiConverter::wikify" at /home/site/wikispaces.com/release/current/lib/perl/HTML/WikiConverter.pm line 624.
    Deep recursion on subroutine "HTML::WikiConverter::get_elem_contents" at /home/site/wikispaces.com/release/current/lib/perl/HTML/WikiConverter.pm line 296.
    Deep recursion on subroutine "HTML::WikiConverter::wikify" at /home/site/wikispaces.com/release/current/lib/perl/HTML/WikiConverter.pm line 624.
    Deep recursion on subroutine "HTML::WikiConverter::get_elem_contents" at /home/site/wikispaces.com/release/current/lib/perl/HTML/WikiConverter.pm line 296.

    Andreas Weigend
    Stanford University
    12:10 pm
