# Unit 4, Section 1: Predictions and Models

Instructional Days: 16

## Enduring Understandings

The regression line is a prediction machine: we give it an x-value, and it gives us a predicted y-value. The regression line summarizes the trend in the data, but some variability in the dependent variable may remain that is not explained by the independent variable. Although the regression line provides optimal predictions when the association is linear, other models are needed when the association is not linear.
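
As a brief sketch of this idea in symbols (the notation $a$, $b$, $\hat{y}$, and $e_i$ is an assumption of this note, not the curriculum's own), the regression line turns an x-value into a predicted y-value, and the residual $e_i$ is the part of each observation the line leaves unexplained:

$$
\hat{y} = a + b\,x, \qquad e_i = y_i - \hat{y}_i = y_i - (a + b\,x_i)
$$

The least-squares regression line is the choice of intercept $a$ and slope $b$ that makes the sum of squared residuals, $\sum_i e_i^2$, as small as possible.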

## Engagement

Students will analyze a map from the Medical Daily website. The map, from the article "How Twitter Can Predict Heart Disease: Negative Tweets Associated With Stress, Higher Risk Of Disease," shows a side-by-side comparison of CDC data on heart attack deaths and the values predicted from Twitter data. Students will engage in a discussion comparing and contrasting the two visualizations. The map can be found at:

## Learning Objectives

Statistical/Mathematical:

S-ID 6: Represent data on two quantitative variables on a scatter plot, and describe how the variables are related.

• a. Fit a function to the data; use functions fitted to data to solve problems in the context of the data. Use given functions or choose a function suggested by the context. Emphasize linear models.
• b. Informally assess the fit of a function by plotting and analyzing residuals.
• c. Fit a linear function for a scatter plot that suggests a linear association.
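
To illustrate item b, here is a minimal sketch of how residuals can reveal that a line is the wrong model. It is written in Python only for illustration (the course itself uses RStudio), the data are made up, and the helper names `fit_line` and `residuals` are hypothetical rather than part of any course package.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(xs, ys, intercept, slope):
    """Return observed y minus predicted y for each point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up data that follow a curve (y = x squared), not a line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 4.0, 9.0, 16.0, 25.0]

intercept, slope = fit_line(xs, ys)
for x, e in zip(xs, residuals(xs, ys, intercept, slope)):
    print(f"x = {x:.0f}, residual = {e:+.1f}")
```

The residuals here are positive at both ends and negative in the middle. A systematic pattern like this, rather than a random scatter around zero, suggests that a linear model does not fit the data well.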

S-ID 7: Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of the data.

S-ID 8: Compute (using technology) and interpret the correlation coefficient of a linear fit.

S-IC 6: Evaluate reports based on data.*

*This standard is woven throughout the course. It is a recurring standard for every unit.

Focus Standards for Mathematical Practice for All of Unit 4:

SMP-2: Reason abstractly and quantitatively.

SMP-4: Model with mathematics.

SMP-7: Look for and make use of structure.

Data Science:

Judge whether or not the linear model is appropriate. Learn to interpret a correlation coefficient in a linear model and interpret slope and intercept. Evaluate the strength of a linear association. Evaluate the potential error in a linear model.

Applied Computational Thinking using RStudio:

• Use linear regression models to predict response values based on sets of predictors.

• Fit a regression line to data and predict outcomes.

• Compute the correlation coefficient of a linear model.

• Create a Participatory Sensing campaign using a campaign Authoring Tool.
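
The regression-related skills in the list above (fitting a line, predicting an outcome, and computing the correlation coefficient) can be sketched as follows. This is an illustration only: it is written in Python rather than in the RStudio environment the course uses, the data are made up and do not come from the course data files, and the function names `fit_line`, `predict`, and `correlation` are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(intercept, slope, x):
    """Use the fitted line as a 'prediction machine' for a new x-value."""
    return intercept + slope * x


def correlation(xs, ys):
    """Return the correlation coefficient r between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up explanatory (x) and response (y) values, for illustration only.
x_values = [10.0, 20.0, 30.0, 40.0, 50.0]
y_values = [25.0, 45.0, 70.0, 85.0, 115.0]

intercept, slope = fit_line(x_values, y_values)
print("intercept:", round(intercept, 2), "slope:", round(slope, 2))
print("prediction at x = 60:", round(predict(intercept, slope, 60.0), 2))
print("correlation r:", round(correlation(x_values, y_values), 3))
```

The slope is the predicted change in y for a one-unit increase in x, and a value of r close to 1 or -1 indicates a strong linear association. Note that predicting at x = 60 extends beyond the observed x-values, so such a prediction should be treated with caution.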

Real-World Connections:

Many studies are published in which predictions are made, and media reports often cite data that make predictions. These studies involve one or more explanatory variables and a response variable, such as income vs. education, weight vs. exercise, and cost of insurance vs. age. Understanding linear regression helps evaluate these studies and reports.

## Language Objectives

1. Students will use complex sentences to construct summary statements about their understanding of data, how it is collected, how it is used, and how to work with it.

2. Students will engage in partner and whole group discussions and presentations to express their understanding of data science concepts.

3. Students will use complex sentences to write informative short reports that use data science concepts and skills.

4. Students will read informative texts to evaluate claims based on data.

## Data File or Data Collection Method

Data Files:

1. LA DWP (dwp_2010)

2. Movies (movie)

Data Collection:

Students will collect data for their water usage campaign.

## Legend for Activity Icons 