Unit 4, Section 5: Ties that Bind
Instructional Days: 3
Enduring Understandings
Clustering is another way to classify data into groups. We classify observations based on the similarity of their numerical characteristics. The k-means algorithm finds the mean of each of k clusters: it starts by randomly assigning an initial value for each mean, then repeatedly assigns each point to its nearest mean and moves each mean to the center of the points assigned to it.
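The idea of starting from random means and then moving them can be sketched in a few lines of base R. This is an illustration only, not the code behind the kmeans function: the data are made up, and the names pts, means, and groups are placeholders chosen for this sketch.

```r
# Minimal sketch of the k-means idea (illustration only; not the kmeans() source).
set.seed(1)  # make the random starting means reproducible

# Made-up data (an assumption): two numeric characteristics for 40 observations
pts <- rbind(
  cbind(rnorm(20, mean = 0), rnorm(20, mean = 0)),
  cbind(rnorm(20, mean = 6), rnorm(20, mean = 6))
)

k <- 2
# Step 1: pick k observations at random as the starting means
means <- pts[sample(nrow(pts), k), , drop = FALSE]

for (iter in 1:20) {
  # Step 2: assign each point to its nearest mean (squared Euclidean distance)
  d <- sapply(1:k, function(j) rowSums(sweep(pts, 2, means[j, ])^2))
  groups <- max.col(-d, ties.method = "first")

  # Step 3: move each mean to the average of the points assigned to it
  new_means <- means
  for (j in 1:k) {
    if (any(groups == j)) {
      new_means[j, ] <- colMeans(pts[groups == j, , drop = FALSE])
    }
  }

  # Stop once the means no longer move
  if (all(abs(new_means - means) < 1e-9)) break
  means <- new_means
}

means          # final mean of each cluster
table(groups)  # number of points in each cluster
```

Because the starting means are random, a different seed can label the clusters in a different order, which is worth pointing out when students compare results.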
Networks classify people into groupings based on who knows whom. Each person is a node, and an edge (tie) connects two nodes when a relationship between those two people is present.
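A small "who knows whom" network can be written down as a table of relationships, following the common convention that people are nodes and relationships are edges. The sketch below uses base R only; the names and relationships are hypothetical, and treating every relationship as mutual is an assumption made for simplicity.

```r
# Hypothetical "who knows whom" data: each row is one relationship (an edge)
edges <- data.frame(
  from = c("Ana", "Ana", "Ben", "Cam", "Dee"),
  to   = c("Ben", "Cam", "Cam", "Dee", "Eli"),
  stringsAsFactors = FALSE
)

# The nodes are the distinct people who appear in any relationship
people <- sort(unique(c(edges$from, edges$to)))

# Adjacency matrix: 1 if two people know each other, 0 otherwise
adj <- matrix(0, nrow = length(people), ncol = length(people),
              dimnames = list(people, people))
adj[cbind(edges$from, edges$to)] <- 1
adj[cbind(edges$to, edges$from)] <- 1  # assumption: knowing someone is mutual

# Degree: how many people each person knows
rowSums(adj)
```

People with a high degree sit at the center of a grouping, which is one simple way to start describing the structure of a network.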
Engagement
Students will determine which points in a plot should be grouped as football players and which points should be grouped as swimmers based on clustering of characteristics.
Learning Objectives
Statistical/Mathematical:
S-IC 2: Decide if a specified model is consistent with results from a given data-generating process, e.g., using simulation.
Data Science:
Understand what RStudio is doing when using the kmeans function to find clusters in a group of data and when creating networks in order to learn how to classify data into groups.
Applied Computational Thinking using RStudio:
• Use the kmeans function to find clusters in a group of data.
• Plot the data with the cluster assignments based on the kmeans function.
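The two tasks above might look like the following in base R, using kmeans from the stats package and plot. This is a sketch under stated assumptions: the athletes data frame and its height and weight values are invented stand-ins for the class data file, and standardizing with scale is a design choice rather than a requirement.

```r
# Hypothetical stand-in data: height (inches) and weight (pounds) for two
# made-up groups of athletes. Replace with the class data set in practice.
set.seed(42)
athletes <- data.frame(
  height = c(rnorm(25, mean = 70, sd = 2), rnorm(25, mean = 75, sd = 2)),
  weight = c(rnorm(25, mean = 165, sd = 10), rnorm(25, mean = 245, sd = 15))
)

# Standardize so that weight (larger numbers) does not dominate the distances
scaled <- scale(athletes)

# Ask for k = 2 clusters; nstart = 10 tries 10 random starts and keeps the best
fit <- kmeans(scaled, centers = 2, nstart = 10)

# Plot the data, coloring each point by its assigned cluster
plot(athletes$height, athletes$weight,
     col = as.integer(fit$cluster), pch = 19,
     xlab = "Height (in)", ylab = "Weight (lb)",
     main = "k-means cluster assignments (k = 2)")
```

The cluster numbers in fit$cluster are arbitrary labels, so cluster 1 in one run may correspond to cluster 2 in another.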
Real-World Connections:
Network analysis is used by many private and public entities, such as the National Security Agency when it wants to find terrorist networks in order to have maximum impact on communications. The k-means algorithm is a technique for grouping entities according to the similarity of their attributes. For example, k-means can divide countries into groups of similar countries so that fair comparisons can be made.
Language Objectives

Students will write, in their own words, an explanation of k-means clustering.

Students will describe the differences between time spent on videogames and time spent on homework, from their own class data.

Students will create visualizations and numerical summaries to explain and justify, orally and in writing, a recommendation to better their community.
Data File or Data Collection Method
Data File:

USMNT and NFL:
data(titanic)

Students' TimeUse campaign data
Data Collection:
Students will collect data for their Team Participatory Sensing campaign.
Legend for Activity Icons