Lesson 4: The Data Cycle
Objective:
Students will learn about the Data Cycle and will understand what a statistical question is.
Materials:

The Data Cycle file (LMR_1.3_Data Cycle)

Computer, projector, or board and markers/chalk

The Data Cycle Spinners handout (LMR_1.4_Data Cycle Spinners)

Cardstock paper

Scissors

Brass Brads

Bros & Dudes Graphics handout (LMR_1.5_Bros & Dudes Graphics)
Vocabulary:
data cycle, statistical questions, data collection, data analysis, data interpretation
Essential Concepts:
A statistical investigation consists of cycling through the four stages of the Data Cycle. Statistical questions are questions that address variability and are productive in that they motivate data collection, analysis, and interpretation. The Data Collection phase might consist of collecting data through Participatory Sensing or some other means, or it might consist of examining previously collected data to determine whether the data are of sufficient quality to answer the statistical questions. Data Analysis is almost always done on the computer and consists of creating relevant graphics and numerical summaries of the data. Data Interpretation uses the analysis to answer the statistical questions.
Lesson:

During the past few lessons, we have discussed what data are, how to collect and organize them, and how their values can vary. But what do we do with all these data? How can we navigate them and turn them into something useful to us?

Inform students that they will be learning about the Data Cycle today. The Data Cycle is a guide we can use when learning to think about data. We always start with asking questions.

Display the Data Cycle graphic from The Data Cycle file (LMR_1.3) on the board or on a projector, and give a brief explanation of the 4 components (listed below).
Note: we will explore each component of the Data Cycle more explicitly throughout the course.

Statistical Questions: Statistical questions are questions that address variability and can be answered with data.

Data Collection: This is the process of observing and recording data, or of examining previously collected data to make sure it meets the needs of the investigation.

Data Analysis: During analysis, tables, graphs, and summaries of the data are produced to help us find patterns and relationships.

Data Interpretation: The statistical questions are answered by referring to the tables, graphs, and summaries made in the Data Analysis phase.


To help students get a firm understanding of the Data Cycle and how each component is connected, they will create a Data Cycle Spinner.

Distribute the following:

Copies (on cardstock paper) of the Data Cycle Spinners handout (LMR_1.4) for each team. If there are more than 4 members in a team, 2 copies of the handout should be distributed.

Plain cardstock paper, cut into fourths, for each team. Each student will use one of the fourths.

One brad per student.


Instruct the students to create their very own spinner (see example below).

At the top of the plain cardstock paper, they should draw a downward pointing arrow. This will denote where their spinner lands in the Data Cycle.

Place the spinner onto the plain cardstock paper (below the arrow) and pierce both papers through the center black circle on the spinner.

Insert the brad through both pieces of cardstock and secure it behind the plain paper.


Almost all statistical investigations begin with statistical questions. There are times when the questions are given to us, in which case we might start at the data collection step, but asking statistical questions should normally be our starting point. No matter where we start, our goal is to get all the way around the cycle at least once. If we can't, we should try to change our approach so that we can get around the entire cycle.

As an example, explain that you might ask a person "How old are you?" Use the steps below to demonstrate the spinner:

This is a question, so we point the spinner to "Statistical Questions."
Note: This is not a statistical question, so we're going to see why not.

The person would give an answer – their age. This is data (one observation), so we spin the pointer to "Data Collection."

This is all the data we need to answer the question, so we spin the Data Cycle to “Data Analysis” to see if we could create some sort of graph or table to display this information.

However, we only have one data value, so it would be impossible to create anything from it (and we don't need to, because we have everything we need to know in order to answer the original question).

Therefore, we cannot complete the “Data Analysis” component and are not able to move on to “Data Interpretation.”

We need to move backwards through the cycle to figure out what went wrong. In this case, what went wrong is that our question was not statistical. It didn’t address variability.


So let's change our question to a statistical one. Pose the question: "How old are the students in our class?"

Next we spin to “Data Collection.” Our data would consist of the ages of all of the students in the class. There would be different values: probably ages between 14 and 19.

We then spin to “Data Analysis”. Is there a graph or a summary we might make? YES! We might make a dotplot of the ages, for example.
Note: Do not create a dotplot.

Then we spin to "Data Interpretation." We might use the graph from the analysis to say, "Students are many different ages, but the typical age is..."

From here we could spin to “Statistical Questions” and try to come up with additional questions to answer and repeat the Data Cycle again.
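
For teachers who would like to see the contrast between the two questions in concrete terms, the following is a minimal Python sketch. It is not part of the lesson materials, and the ages are invented for illustration. The helper names `has_variability` and `tally` are hypothetical. The sketch assumes that "addresses variability" can be roughly approximated by checking whether the collected values differ from one another.

```python
from collections import Counter


def has_variability(values):
    """Return True when the data contain more than one distinct value."""
    return len(set(values)) > 1


def tally(values):
    """Count how many times each value occurs, ordered by value."""
    return dict(sorted(Counter(values).items()))


# "How old are you?" yields a single observation (hypothetical age).
one_person = [16]

# "How old are the students in our class?" yields many observations.
# These ages are made up; a real class would supply its own data.
class_ages = [14, 15, 15, 16, 16, 16, 17, 17, 18]

print(has_variability(one_person))  # nothing to summarize
print(has_variability(class_ages))  # values differ, so a summary is useful
print(tally(class_ages))            # a simple numerical summary of the ages
```

With a single observation there is nothing to summarize, which is why the first question stalls at "Data Analysis," while the class data can be tallied and then interpreted.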


Distribute the Bros & Dudes Graphics handout (LMR_1.5) to each team. There are 10 different versions of word pairings (10 combinations of 2 words chosen from the 5 options), so multiple teams will have the same graphic if there are more than 10 teams in a class.
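
The count of 10 versions follows from choosing 2 words out of 5 without regard to order. A short Python sketch of that arithmetic is given here as an aid for the teacher. Only "Bro" and "Buddy" are named in this lesson, so the other three labels are placeholders and are not the actual terms on the handout.

```python
from itertools import combinations
from math import comb

# "Bro" and "Buddy" come from the lesson text; the remaining three labels
# are placeholders standing in for the other terms on the handout.
terms = ["Bro", "Buddy", "Term C", "Term D", "Term E"]

# Every unordered pair of two different terms.
pairings = list(combinations(terms, 2))

for first, second in pairings:
    print(f"{first} & {second}")

# The number of pairs listed matches "5 choose 2".
print(len(pairings), comb(len(terms), 2))
```

Because there are only 10 distinct pairings, a class with more than 10 teams will necessarily have at least two teams working from the same graphic.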

Inform the students that the graphics shown on their handouts were created for the Quartz website by Nikhil Sonnad as a data visualization. He collected the data via Twitter. The graphics show how common certain terms are throughout the United States when referring to friends. The goal of this activity is for each team to come up with 2 questions that could be asked given their particular graphics. For example, a team given the “Bro” and “Buddy” graphics might come up with the following questions: Which region of the US is most likely to use the term “Bro” when referring to a friend? Do the coastal areas prefer different terms than the Midwest? Is there a difference between northern states and southern states?

In their DS journals, each student should write down the 4 components of the Data Cycle, then use their spinners to see if they can make it all the way through with their chosen questions. They should record notes about whether or not each component can be completed, and why.

The teams should create a small Data Cycle graphic using ONE of their questions to turn in for assessment (suggested size is 8.5”x11”). The cycle should be clearly labeled and have appropriate responses for each of the 4 components.

Each team’s graphic must make a conclusion about why their question was or was not a statistical question.


If teams have questions that are not statistical, have them reformulate those questions into statistical questions for homework.
Class Scribes:
One team of students will give a brief talk to discuss what they think the 3 most important topics of the day were.
Homework:
Students reformulate any nonstatistical questions generated by their team about the Bros & Dudes Graphics handout into statistical questions.