Table of Contents

Content Page
Overview and Philosophy 8
Scope and Sequence 14
UNIT 1 Campaign Topics
Daily Overview 19
Essential Concepts 20
Section 1: Data Are All Around 22
Lesson 1: Data Trails Defining data, consumer privacy 24
Lesson 2: Stick Figures Organizing & collecting data 26
Lesson 3: Data Structures Organizing data, rows & columns, variables 28
Lesson 4: The Data Cycle Data cycle, statistical questions 30
Lesson 5: So Many Questions Statistical questions, variability 35
Lesson 6: What Do I Eat? Food Habits Data cycle, collecting data 39
Lesson 7: Setting the Stage Food Habits – data Participatory Sensing 42
Section 2: Visualizing Data 47
Lesson 8: Tangible Plots Food Habits – data Dotplots, minimum/maximum, frequency 48
Lesson 9: What Is Typical? Food Habits – data Typical value, center 52
Lesson 10: Making Histograms Food Habits – data Histograms, bin widths 54
Lesson 11: What Shape Are You In? Food Habits – data Shape, center, spread 57
Lesson 12: Exploring Food Habits Food Habits – data Single & multi-variable plots 59
Lesson 13: RStudio Basics Food Habits – data Intro to RStudio 61
Lab 1A: Data, Code & RStudio Food Habits – data RStudio basics 64
Lab 1B: Get the Picture? Food Habits – data Variable types, bar graphs, histograms 68
Lab 1C: Export, Upload, Import Importing data 71
Lesson 14: Variables, Variables, Variables Multi-variable plots 76
Lab 1D: Zooming Through Data Subsetting 80
Lab 1E: What’s the Relationship? Multi-variable plots 84
Practicum: The Data Cycle & My Food Habits Food Habits Data cycle, variability 87
Section 3: Would You Look at the Time 89
Lesson 15: Americans’ Time on Task Time Use – data Evaluating claims 90
Lab 1F: A Diamond In the Rough Time Use – data Cleaning names, categories, and strings 95
Lesson 16: Categorical Associations Time Use – data Joint relative frequencies in 2-way tables 100
Lesson 17: Interpreting Two-Way Tables Time Use – data Marginal & conditional relative frequencies 102
Lab 1G: What’s the FREQ? Time Use – data 2-way tables, tally 107
Practicum: Teen Depression Time Use Statistical questions, interpreting plots 110
Lab 1H: Our Time Data cycle, synthesis 112
End of Unit 1 Project: Evaluating Claims from the Media Data cycle 113
UNIT 2 Campaign Topics
Daily Overview 115
Essential Concepts 116
Section 1: What is Your True Color? 118
Lesson 1: What Is Your True Color? Personality Color – data Subsets, relative frequency 120
Lesson 2: What Does Mean Mean? Personality Color Measures of center – mean 123
Lesson 3: Median In the Middle Personality Color Measures of center – median 127
Lesson 4: How Far Is It from Typical? Personality Color Measures of spread – MAD 130
Lab 2A: All About Distributions Personality Color Measures of center & spread 134
Lesson 5: Human Boxplots Boxplots, IQR 136
Lesson 6: Face Off Comparing distributions 139
Lesson 7: Plot Match Comparing distributions 142
Lab 2B: Oh, the Summaries… Personality Color Numerical summaries, custom functions 144
Practicum: The Summaries Food Habits or Time Use Data cycle, comparing distributions 147
Section 2: How Likely Is It? 149
Lesson 8: How Likely Is It? Probability, simulations 150
Lesson 9: Dice Detective Simulations to detect unfairness 153
Lesson 10: Marbles, Marbles Probability, with replacement 157
Lab 2C: Which Song Plays Next? Probability of simple events, do loops, set.seed() 159
Lesson 11: This AND/OR That Compound probabilities 162
Lab 2D: Queue It Up! Probability with/without replacement, sample() 166
Practicum: Win, Win, Win Probability estimation through repeated simulations 169
Section 3: Are You Stressing or Chilling? 170
Lesson 12: Don’t Take My Stress Away Stress/Chill – data Introduction to campaign 172
Lesson 13: The Horror Movie Shuffle Stress/Chill – data Chance differences – categorical 176
Lab 2E: The Horror Movie Shuffle Stress/Chill – data Inference for categorical variables 180
Lesson 14: The Titanic Shuffle Stress/Chill – data Chance differences – numerical 183
Lab 2F: The Titanic Shuffle Stress/Chill – data Inference for numerical variables 187
Lesson 15: Tangible Data Merging Stress/Chill – data Merging datasets 189
Lab 2G: Getting It Together Stress/Chill & Personality Color Stacking vs. joining datasets 191
Practicum: What Stresses Us? Stress/Chill & Personality Color Analyzing merged data 193
Section 4: What’s Normal? 194
Lesson 16: What Is Normal? Introduction to normal curve 195
Lesson 17: A Normal Measure of Spread Measures of spread – SD 198
Lesson 18: What’s Your Z-Score? z-scores, shuffling 201
Lab 2H: Eyeballing Normal Normal curves overlaid on distributions & simulated data 205
Lab 2I: R’s Normal Distribution Alphabet Normal probability, rnorm(), pnorm(), qnorm() 207
End of Unit 2 Project: Comparing Groups Using Our Own Data Stress/Chill, Personality Color, Food Habits, or Time Use Synthesis of Unit 2 209
UNIT 3 Campaign Topics
Daily Overview 211
Essential Concepts 212
Section 1: Testing, Testing…1, 2, 3… 214
Lesson 1: Anecdotes vs. Data Reading articles critically, data 216
Lesson 2: What is an Experiment? Experiments, causation 219
Lesson 3: Let’s Try an Experiment! Random assignments, confounding factors 222
Lesson 4: Predictions, Predictions Visualizations, predictions 224
Lesson 5: Time Perception Experiment Elements of an experiment 226
Lab 3A: The Results Are In! Analyzing experiment data 228
Practicum: TB or Not TB Time Perception Simulation using experiment data 229
Section 2: Would You Look at That? 231
Lesson 6: Observational Studies Observational study 233
Lesson 7: Observational Studies vs. Experiments Observational study, experiment 235
Lesson 8: Monsters that Hide in Observational Studies Observational study, confounding factors 237
Lab 3B: Confound it all! Confounding factors 241
Section 3: Are You Asking Me? 243
Lesson 9: Survey Says… Survey 244
Lesson 10: We’re So Random Data collection, random samples 247
Lesson 11: The Gettysburg Address Sampling bias 251
Lab 3C: Random Sampling Random sampling 256
Lesson 12: Bias in Survey Sampling Bias in survey sampling 258
Lesson 13: The Confidence Game Confidence intervals 261
Lesson 14: How Confident Are You? Confidence intervals, margin of error 264
Lab 3D: Are You Sure about That? Bootstrapping 266
Practicum: Let’s Build a Survey! Survey design with non-leading questions 269
Section 4: What’s the Trigger? 270
Lesson 15: Ready, Sense, Go! Sensors, data collection 271
Lesson 16: Does it have a Trigger? Survey questions, sensor questions 274
Lesson 17: Creating Our Own PS Campaign Participatory Sensing campaign creation 276
Lesson 18: Evaluating Our Own PS Campaign Statistical questions, evaluate campaign 279
Lesson 19: Implementing Our Own PS Campaign Class Campaign—data Mock-implement & create campaign 281
Section 5: Webpages 283
Lesson 20: Online Data-ing Class Campaign—data Data on the internet 284
Lab 3E: Scraping Web Data Class Campaign—data Scraping data from the Internet 287
Lab 3F: Maps Class Campaign—data Making maps with data from the Internet 289
Lesson 21: Learning to Love XML Class Campaign—data Data storage, XML 291
Lesson 22: Changing Format Class Campaign—data Converting XML files 296
End of Unit 3 Project: Analyzing Our Own Campaign Data Class Campaign Statistical question, our data 299
UNIT 4 Campaign Topics
Daily Overview 301
Essential Concepts 302
Section 1: Campaigns and Community 304
Lesson 1: Trash Modeling to answer real world problems, official datasets 306
Lesson 2: Drought Exploratory data analysis, campaign creation 309
Lesson 3: Community Connection Team Campaign—data Community topic research, campaign creation 311
Lesson 4: Evaluate and Implement the Campaign Team Campaign—data Evaluate & mock-implement campaign 314
Lesson 5: Refine and Create the Campaign Team Campaign—data Revise and edit campaign, data collection 316
Section 2: Predictions and Models 317
Lesson 6: Statistical Predictions Using One Variable Team Campaign—data One variable predictions using a rule 319
Lesson 7: Statistical Predictions Applying the Rule Team Campaign—data Predictions applying MSE, MAE 321
Lesson 8: Statistical Predictions Using Two Variables Team Campaign—data Two-variable statistical predictions 325
Lesson 9: The Spaghetti Line Team Campaign—data Estimate line of best fit, linear regression 328
LAB 4A: If the Line Fits… Team Campaign—data Estimate line of best fit 330
Lesson 10: What’s the Best Line? Team Campaign—data Predictions based on linear models 332
LAB 4B: What’s the Score? Team Campaign—data Comparing predictions to real data 335
LAB 4C: Cross-Validation Team Campaign—data Use training and test data for predictions 337
Lesson 11: What’s the Trend? Team Campaign—data Trend, associations, linear model 340
Lesson 12: How Strong Is It? Team Campaign—data Correlation coefficient, strength of trend 344
LAB 4D: Interpreting Correlations Team Campaign—data Correlation coefficient, best model 347
Lesson 13: Improving Your Model Team Campaign—data Non-linear regression 350
LAB 4E: Some Models Have Curves Team Campaign—data Non-linear regression 352
Practicum: Predictions Team Campaign—data Linear regression 354
Section 3: Piecing it Together 355
Lesson 14: More Variables to Make Better Predictions Team Campaign—data Multiple linear regression 357
Lesson 15: Combination of Variables Team Campaign—data Multiple linear regression 360
LAB 4F: This Model Is Big Enough for All of Us Team Campaign—data Multiple linear regression 363
Section 4: Decisions, Decisions! 364
Lesson 16: Football or Futbol? Team Campaign—data Multiple predictors, classifying into groups 365
Lesson 17: Grow Your Own Decision Tree Team Campaign—data Decision trees based on training/test data 371
LAB 4G: Growing Trees Team Campaign—data Decision trees to classify observations 375
Section 5: Ties That Bind 378
Lesson 18: Where Do I Belong? Team Campaign—data Clustering, k-means 379
LAB 4H: Finding Clusters Team Campaign—data Clustering, k-means 385
Lesson 19: Our Class Network Team Campaign—data Clustering, networks 387
End of Unit 4 Project: Modeling a Community Issue Team Campaign Synthesis of Unit 4 390