# Unit 4, Section 5: Ties that Bind

Instructional Days: 3

## Enduring Understandings

Clustering is another way to classify data into groups. We classify observations based on their numerical characteristics and how similar those characteristics are. The k-means algorithm finds a mean (center) for each of the k clusters: it starts from randomly assigned initial means, assigns each point to its nearest mean, and then moves each mean to the average of the points assigned to it, repeating until the means stop changing.
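The idea behind k-means can be sketched by hand in base R. This is a minimal illustration, not the routine RStudio actually runs: the toy points, the choice of `k <- 2`, and the cap of 10 passes are all assumptions made for the example.

```r
# Toy data: two made-up groups of points (hypothetical values, not class data)
set.seed(1)
x <- c(rnorm(20, mean = 0), rnorm(20, mean = 10))
y <- c(rnorm(20, mean = 0), rnorm(20, mean = 10))
pts <- cbind(x, y)

k <- 2
# Step 1: randomly pick k of the points as the initial means
centers <- pts[sample(nrow(pts), k), , drop = FALSE]

for (i in 1:10) {
  # Step 2: squared distance from every point to every mean (one column per mean)
  d <- sapply(1:k, function(j) {
    (pts[, 1] - centers[j, 1])^2 + (pts[, 2] - centers[j, 2])^2
  })
  # Assign each point to the mean with the smallest distance
  cluster <- max.col(-d, ties.method = "first")

  # Step 3: move each mean to the average of the points assigned to it
  new_centers <- centers
  for (j in 1:k) {
    if (any(cluster == j)) {
      new_centers[j, ] <- colMeans(pts[cluster == j, , drop = FALSE])
    }
  }

  # Stop once the means no longer move
  if (all(new_centers == centers)) break
  centers <- new_centers
}

centers         # final mean of each cluster
table(cluster)  # how many points landed in each cluster
```

Because the starting means are random, different seeds can label the clusters in a different order, which is worth pointing out to students.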

Networks classify people into groupings based on who knows whom. Each person is a node, and a connection (edge) between two nodes is formed when a relationship between those two people is present.
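A small "who knows whom" network can be written down in base R as a list of relationships and turned into a table of connections. This is only a sketch: the names are hypothetical, and treating every relationship as mutual is an assumption of the example.

```r
# Each row is one relationship between two people (hypothetical names)
edges <- data.frame(
  from = c("Ana", "Ana", "Ben", "Dee"),
  to   = c("Ben", "Cal", "Cal", "Eli"),
  stringsAsFactors = FALSE
)

# Everyone who appears in at least one relationship
people <- sort(unique(c(edges$from, edges$to)))

# Table with a 1 wherever two people know each other, 0 otherwise
adj <- matrix(0, nrow = length(people), ncol = length(people),
              dimnames = list(people, people))
adj[cbind(edges$from, edges$to)] <- 1
adj[cbind(edges$to, edges$from)] <- 1  # assume relationships go both ways

# Number of connections each person has
degree <- rowSums(adj)
degree
```

In this toy network Ana, Ben, and Cal all know one another, while Dee and Eli know only each other, so two groupings emerge from the relationships alone.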

## Engagement

Students will determine which points in a plot should be grouped as football players and which points should be grouped as swimmers based on clustering of characteristics.

## Learning Objectives

Statistical/Mathematical:

S-IC 2: Decide if a specified model is consistent with results from a given data-generating process, e.g., using simulation.

Data Science:

Understand what RStudio is doing when using the k-means function to find clusters in a group of data and when creating networks in order to learn how to classify data into groups.

Applied Computational Thinking using RStudio:

• Use the k-means function to find clusters in a group of data.

• Plot the data with the cluster assignments based on the k-means function.
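The two objectives above might look like the following in base R. Treat it as a sketch under stated assumptions: the `athletes` data frame and its `height` and `weight` values are simulated stand-ins rather than the course data set, and base `plot()` is used here in place of whatever plotting function the labs call for.

```r
set.seed(42)

# Hypothetical stand-in data: two groups of athletes with different builds
athletes <- data.frame(
  height = c(rnorm(15, mean = 70, sd = 2), rnorm(15, mean = 76, sd = 2)),
  weight = c(rnorm(15, mean = 165, sd = 10), rnorm(15, mean = 225, sd = 10))
)

# Ask kmeans() from base R's stats package for two clusters
fit <- kmeans(athletes, centers = 2)

# Store each observation's cluster assignment alongside the data
athletes$cluster <- factor(fit$cluster)

# Plot the data, colored by cluster assignment
plot(athletes$height, athletes$weight,
     col = athletes$cluster, pch = 19,
     xlab = "Height (in)", ylab = "Weight (lb)")

# Mark the mean of each cluster with an X
points(fit$centers, pch = 4, cex = 2)
```

`fit$cluster` holds the group number assigned to each row and `fit$centers` holds the mean of each group, which are the two pieces students need in order to interpret the plot.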

Real-World Connections:

Network analysis is used by many private and public entities. The National Security Agency, for example, uses it to find terrorist networks and to have maximum impact on their communications. The k-means algorithm is a technique for grouping entities according to the similarity of their attributes. For example, k-means can divide countries into similar groups so that fair comparisons can be made between them.

## Language Objectives

1. Students will write, in their own words, an explanation of k-means clustering.

2. Students will describe the differences between time spent on video games and time spent on homework, using their own class data.

3. Students will create visualizations and numerical summaries to explain and justify, orally and in writing, a recommendation to better their community.

## Data File or Data Collection Method

Data File:

1. USMNT and NFL: `data(futbol)`

2. Students' TimeUse campaign data

Data Collection:

Students will collect data for their Team Participatory Sensing campaign.