# Essential Concepts

**IDS Unit 1: Essential Concepts**

__Lesson 1: Data Trails__

Data are a collection of recorded observations. Data are gathered by people and by sensors. Patterns in data can reveal previously unknown patterns in our world. Data play a large, and sometimes invisible, role in our lives.

__Lesson 2: Stick Figures__

Data consist of records of particular characteristics of people or objects. Data can be organized in many different ways, and some ways make it easier than others to achieve particular purposes.

__Lesson 3: Data Structures__

Variables record values that vary. By organizing data into rectangular format, we can easily see the characteristics of an observation by reading across a row, or we can see the variability in a variable by reading down a column. Computers can easily process data when they are in rectangular format.
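
As an illustration of rectangular format, here is a minimal sketch in Python (not the course's RStudio environment). The student names and values are made up for the example. Each dictionary is one row (an observation) and each key is one column (a variable).

```python
# Hypothetical data: each dict is one row (one observation),
# and each key is one column (one variable).
students = [
    {"name": "Ana", "height_cm": 160, "eats_breakfast": "yes"},
    {"name": "Ben", "height_cm": 172, "eats_breakfast": "no"},
    {"name": "Cara", "height_cm": 165, "eats_breakfast": "yes"},
]

# Reading across a row: all recorded characteristics of one observation.
first_row = students[0]

# Reading down a column: every recorded value of one variable.
heights = [row["height_cm"] for row in students]

print(first_row)
print(heights)
```

Because every row has the same keys, the same one-line loop works for any column, which is part of why rectangular data are easy for a computer to process.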

__Lesson 4: The Data Cycle__

A statistical investigation consists of cycling through the four stages of the Data Cycle. Statistical questions address variability and are productive in that they motivate data collection, analysis, and interpretation. The Data Collection phase might consist of collecting data through Participatory Sensing or some other means, or of examining previously collected data to determine whether they are of sufficient quality to answer the statistical questions. Data Analysis is almost always done on a computer and consists of creating relevant graphics and numerical summaries of the data. Data Interpretation uses the analysis to answer the statistical questions.

__Lesson 5: So Many Questions__

A statistical investigative question typically begins as a vague, general question and develops into a precise one. The process of creating a good investigative question is iterative and requires time and effort to get right.

__Lesson 6: What Do I Eat? [The Data Cycle: Consider Data]__

After raising statistical questions, we examine and record data to see if the questions are appropriate.

__Lesson 7: Setting the Stage [The Data Cycle: Collect Data]__

In Participatory Sensing, we humans behave as if we are robot sensors, collecting data whenever a "trigger" event occurs. Our ability to learn about the patterns in our lives through these data depends on our being reliable data collectors.

__Lesson 8: Tangible Plots [The Data Cycle: Analyze Data]__

Distributions organize data for us by telling us (a) which values of a variable were observed, and (b) how many times the values were observed (their frequency).
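
The two parts of a distribution, (a) the observed values and (b) their frequencies, can be sketched in a few lines of Python. The data below are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical observations of one variable: servings of fruit eaten per day.
servings = [2, 0, 1, 2, 3, 1, 2, 0, 2]

# Counter pairs each observed value with how many times it was observed.
distribution = Counter(servings)

# Print each value next to its frequency, in increasing order of value.
for value in sorted(distribution):
    print(value, distribution[value])
```

The frequencies always add up to the number of observations, which is a quick check that no value was missed.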

__Lesson 9: What Is Typical?__

The “center” of a distribution is a deliberately vague term, but it is one way to answer the subjective question "what is a typical value?" The center could be the perceived balancing point or the value that approximately cuts the area of the distribution in half.
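
One common way to put numbers on these two ideas is the mean (the balancing point) and the median (the value that splits the observations in half). The sketch below uses Python's standard `statistics` module on made-up data. It illustrates the two notions and is not meant as the only way to define a typical value.

```python
import statistics

# Hypothetical data: minutes spent on homework by seven students.
minutes = [10, 20, 20, 30, 40, 50, 110]

# The mean is the balancing point of the distribution.
balance_point = statistics.mean(minutes)

# The median has half of the observations on each side of it.
halfway_value = statistics.median(minutes)

print(balance_point, halfway_value)
```

The two answers differ here because the single large value (110) pulls the balancing point upward while leaving the halfway value unchanged.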

__Lesson 10: Making Histograms__

Histograms can be created through the use of an algorithm. The distributions displayed in a histogram can be classified using the technical terms for the shapes of distributions. Learning to describe routine tasks through an algorithm is an important component of computational thinking.
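
One possible version of such an algorithm is sketched below in Python: choose a starting value and a bin width, then count how many values land in each bin. The function name `histogram_counts`, its parameters, and the height data are all hypothetical, introduced only for this example.

```python
def histogram_counts(values, start, width, n_bins):
    """Count how many values fall in each of n_bins equal-width bins.

    Bin i covers start + i*width up to (but not including)
    start + (i+1)*width. Values outside the overall range are ignored.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)  # which bin this value belongs to
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical data: heights in centimeters.
heights = [150, 152, 158, 161, 163, 164, 167, 171, 178]
counts = histogram_counts(heights, start=150, width=10, n_bins=3)

# Draw a text histogram: one row per bin, one mark per observation.
for i, count in enumerate(counts):
    low = 150 + i * 10
    print(f"{low}-{low + 10}: " + "#" * count)
```

Changing `width` changes how the same data look, so the choice of bins is part of the algorithm rather than a detail.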

__Lesson 11: What Shape Are You In?__

Identifying the shape of a histogram is part of the **interpret** step of the Data Cycle.

__Lesson 12: Exploring Food Habits__

Once Participatory Sensing data have been collected, the Dashboard and PlotApp perform the analysis step of the Data Cycle, though humans need to tell the computer which plots to examine.

__Lesson 13: RStudio Basics__

The computer has its own syntax, and it can only understand you if you speak its language.

__Lesson 14: Variables, Variables, Variables__

To examine whether two (or more) variables are related, we can plot their distributions on the same graph.
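
As a rough text-only sketch of this idea, the Python below splits a numeric variable by the values of a categorical variable and prints both distributions against the same axis. The variables (`eats_breakfast`, hours of sleep) and the data are hypothetical.

```python
from collections import defaultdict

# Hypothetical data: (eats_breakfast, hours_of_sleep) for eight students.
records = [("yes", 8), ("no", 6), ("yes", 7), ("no", 6),
           ("yes", 8), ("no", 7), ("yes", 9), ("no", 5)]

# Split the numeric variable by the values of the categorical variable.
sleep_by_group = defaultdict(list)
for breakfast, hours in records:
    sleep_by_group[breakfast].append(hours)

# Shared axis: 5, 6, 7, 8, 9 hours. Each digit is the count at that value.
for group, hours in sorted(sleep_by_group.items()):
    marks = "".join(str(hours.count(h)) for h in range(5, 10))
    print(f"{group:>3}: {marks}")
```

Lining the two rows up on one axis makes it easy to see whether one group's values tend to sit higher than the other's.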

__Lesson 15: Americans’ Time on Task__

Learning to examine other analyses is an important part of statistical thinking.

__Lesson 16: Categorical Associations__

A two-way table is a summary of the association/relationship between two categorical variables.
Joint relative frequencies answer questions of the form "what proportion of the people/objects had
*this* value on the first variable and *this* value on the second?"
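
A minimal Python sketch of a two-way table and its joint relative frequencies follows. The two variables (`eats_breakfast`, `plays_sport`) and the eight records are made up for illustration.

```python
from collections import Counter

# Hypothetical data: (eats_breakfast, plays_sport) for eight students.
records = [
    ("yes", "yes"), ("yes", "no"), ("yes", "yes"), ("no", "no"),
    ("no", "yes"), ("yes", "yes"), ("no", "no"), ("yes", "no"),
]

# Two-way table of counts: one cell per combination of values.
table = Counter(records)
total = len(records)

# Joint relative frequency: each cell count divided by the grand total.
joint = {cell: count / total for cell, count in table.items()}

for cell in sorted(joint):
    print(cell, joint[cell])
```

Each joint relative frequency answers one "this value and this value" question, and together they add up to 1.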

__Lesson 17: Interpreting Two-Way Tables__

Marginal (relative) frequencies tell us about the distribution of a single variable. Conditional relative frequencies tell us about the distribution of one variable when "subsetting" the other.
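
The difference between the two can be sketched in Python using a hypothetical two-way table of counts. The variable names and the counts are assumptions made for the example.

```python
from collections import Counter

# Hypothetical two-way table of counts, keyed by (eats_breakfast, plays_sport).
table = Counter({("yes", "yes"): 3, ("yes", "no"): 2,
                 ("no", "yes"): 1, ("no", "no"): 2})
total = sum(table.values())

# Marginal relative frequency of eats_breakfast: add across plays_sport,
# then divide by the grand total.
breakfast_counts = Counter()
for (breakfast, sport), count in table.items():
    breakfast_counts[breakfast] += count
marginal = {value: count / total for value, count in breakfast_counts.items()}

# Conditional relative frequency of plays_sport among breakfast eaters only:
# subset to the "yes" row, then divide by that row's total.
sport_given_breakfast = {
    sport: count / breakfast_counts["yes"]
    for (breakfast, sport), count in table.items()
    if breakfast == "yes"
}

print(marginal)
print(sport_given_breakfast)
```

The marginal frequencies divide by the grand total, while the conditional frequencies divide by the total of the subset, so each set adds up to 1 on its own.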