Lab 3C: Random Sampling
Directions: Follow along with the slides and answer the questions in bold font in your journal.
Learning by sampling
In many circumstances, there's simply no feasible way to gather data about everyone in a population.
– For example, the Department of Water & Power (DWP) wants to determine how much water people in Los Angeles use to take a shower. They've created a survey to pass out to collect this information.
– Write down two reasons why getting everyone in Los Angeles to fill out the survey would be difficult. Also, write a sentence explaining why the DWP might consider using a sample of households instead.
In this lab, we'll learn how sampling methods affect how representative a sample is of a population.
Loading a population
In previous labs, we used the cdc data as a sample of young people in the United States.
– In this lab, we'll consider these survey respondents to be our population.
Load the cdc data into R and fill in the blanks to take a convenience sample of the first 50 people in the data:
s1 <- slice(____, 1:____)
Why do you think we call this method a convenience sample?
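As a rough illustration of the idea (not the lab's own code), here is a minimal base R sketch on a small made-up data frame. The names toy_pop and toy_convenience are hypothetical, and head() stands in for the slice() call above so that no extra packages are needed.

```r
# Hypothetical toy population: 20 students stored in grade order.
toy_pop <- data.frame(
  id    = 1:20,
  grade = rep(c("9", "10", "11", "12"), each = 5)
)

# Convenience sample: simply take the first 5 rows as they are stored.
toy_convenience <- head(toy_pop, 5)

# Count how many sampled students fall in each grade.
table(toy_convenience$grade)
```

Because the toy rows happen to be stored in grade order, the first rows all share one grade. Whether something similar happens with real data depends on how that data was recorded.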
Comparing your convenience sample
- A convenience sample is a sample from a population where we collect data on subjects because they're easy to find.
Using your convenience sample, create a bargraph for the number of people in each grade.
– Do you think the distribution of grade for your sample would look similar to that of the whole cdc data?
– Which groups of people do you think are over- or underrepresented in your convenience sample? Why?
– Compare the distributions of grade for the cdc data and your convenience sample and write down how they differ.
Fill in the blanks below to create a sample by randomly selecting 50 people in the cdc data, without replacement. The first blank is the name you give this new sample.
___ <- sample(___, size = ___, replace = ___)
Write a sentence that explains why you think the distribution of grade for this random sample will look more or less similar to the distribution from the whole cdc data.
– Create a bargraph for grade based on this random sample to check your prediction.
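For readers who want to see what sampling without replacement does, here is a minimal base R sketch on made-up data. The names toy_pop, rows and toy_random are hypothetical. The lab's sample() call is given a whole data set, presumably through a helper from the lab's packages; in plain base R, one common approach is to sample row numbers and then keep those rows.

```r
set.seed(1)  # make the random draw reproducible

# Hypothetical toy population of 20 students.
toy_pop <- data.frame(
  id    = 1:20,
  grade = rep(c("9", "10", "11", "12"), each = 5)
)

# Draw 5 distinct row numbers at random, then keep those rows.
rows <- sample(nrow(toy_pop), size = 5, replace = FALSE)
toy_random <- toy_pop[rows, ]

# With replace = FALSE no student can be picked twice;
# anyDuplicated() returns 0 when there are no repeats.
anyDuplicated(toy_random$id)
```

With replace = TRUE the same row number could be drawn more than once, so the same person could appear in the sample several times.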
Increasing sample size
Create bargraphs for grade based on random samples of each of the following sizes: 10, 100, 1,000, 10,000.
– Compare each distribution to that of the population.
How do the distributions change as the size of the sample increases? Why do you think this occurs?
Use tally() to find the proportion of each grade for your convenience sample and all your random samples.
– Which set of proportions looks most similar to the proportions of the population?
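The comparison of proportions can also be sketched in base R without the lab's tally() function. The example below is only an illustration: the population is simulated, the grade proportions (0.4, 0.3, 0.2, 0.1) are invented, and grade_props() is a hypothetical helper built on table() and prop.table().

```r
set.seed(2)  # make the simulated data reproducible

# Hypothetical population of 10,000 grades with invented proportions.
grades <- c("9", "10", "11", "12")
toy_pop <- sample(grades, size = 10000, replace = TRUE,
                  prob = c(0.4, 0.3, 0.2, 0.1))

# Proportion of each grade in a vector, rounded for easy reading.
# factor() keeps all four grades in the result even if one is missing.
grade_props <- function(x) {
  round(prop.table(table(factor(x, levels = grades))), 3)
}

grade_props(toy_pop)                       # whole population
grade_props(sample(toy_pop, size = 10))    # small random sample
grade_props(sample(toy_pop, size = 1000))  # larger random sample
```

Printing the three results side by side makes it easy to see which sample's proportions sit closest to the population's.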
A mean or proportion from a single random sample will not always be closer to the true population value than one from a convenience sample.
However, as sample sizes get larger:
– Random samples will tend to be better estimates for the population.
– With convenience samples, this might not be the case.
Write down a reason why estimates based on convenience samples might not improve even as sample size increases.
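One way to explore this claim is with a small simulation. The sketch below is an assumption-laden toy, not the DWP's data: the shower lengths are simulated, the population is deliberately stored in sorted order, and the "convenience sample" is simply the first n stored values. The names toy_pop, sizes, convenience_err and random_err are hypothetical.

```r
set.seed(3)  # make the simulation reproducible

# Hypothetical population: 10,000 simulated shower lengths in minutes,
# stored so that the shortest showers come first.
toy_pop <- sort(rexp(10000, rate = 1 / 8))
true_mean <- mean(toy_pop)

sizes <- c(10, 100, 1000)

# Absolute error of the sample mean at each sample size.
# Convenience sample: the first n stored values.
convenience_err <- sapply(sizes, function(n) abs(mean(toy_pop[1:n]) - true_mean))
# Random sample: n values drawn at random without replacement.
random_err <- sapply(sizes, function(n) abs(mean(sample(toy_pop, n)) - true_mean))

data.frame(size = sizes, convenience_err, random_err)
```

Try changing the sizes or the way the toy population is ordered and see how each column of errors responds.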