Unit 3 Vocabulary
algorithm
a process or set of rules that is followed to solve a problem or complete a task
anecdote
a story that someone tells about his/her own experience or the experience of someone he/she knows
associated
joined together or related; in statistics, two variables are associated when the values of one tend to change along with the values of the other
bootstrapping
repeatedly taking random samples, with replacement, from an existing sample in order to estimate how much a statistic varies
cause
a reason for an action or condition
closed-ended questions
questions that give a fixed set of choices
confidence interval
an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data
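As a rough illustration of bootstrapping and confidence intervals together, the sketch below resamples a small data set with replacement many times and takes the middle 95% of the resampled means as an interval estimate. The data values, the function name `bootstrap_ci`, and the choice of 1,000 resamples are all made up for this example; it is one simple approach (the percentile method), not the only one.

```python
import random
import statistics

def bootstrap_ci(sample, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    means = []
    for _ in range(n_resamples):
        # Draw a resample of the same size, with replacement.
        resample = rng.choices(sample, k=len(sample))
        means.append(statistics.mean(resample))
    means.sort()
    # Cut off the lowest and highest (1 - level) / 2 share of the means.
    tail = (1 - level) / 2
    low = means[round(tail * n_resamples)]
    high = means[round((1 - tail) * n_resamples) - 1]
    return low, high

# Hypothetical data: hours of sleep reported by ten students.
hours = [6.5, 7.0, 8.0, 5.5, 7.5, 6.0, 9.0, 7.0, 6.5, 8.5]
low, high = bootstrap_ci(hours)
print(f"95% bootstrap interval for the mean: {low:.2f} to {high:.2f}")
```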
confounding factors
“extra” variables that you didn’t account for and that may affect the outcome
control group
the group that does not receive a treatment
cost limitations
the limitation of funds or money
data
information, or observations, that have been gathered and recorded
data farm
a physical space where high-capacity servers are placed to store large amounts of data
ethics
a system of moral principles
experiment
one method of data collection; a procedure that can be repeated and that has a set of possible results
feasibility
how easy or difficult it is to do something
HTML (Hyper Text Markup Language)
a standardized system for tagging text files to achieve font, color, graphic, and hyperlink effects on web pages
inferences
conclusions drawn about an underlying population based on a sample or subset of the data
interval
a data type which is measured along a scale, in which each point is placed at equal distance from one another
margin of error
tells you how many percentage points your results may differ from the real population value, at a given level of confidence
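One common way to put a number on the margin of error for a survey percentage is the formula z * sqrt(p(1 - p) / n), which assumes a simple random sample. The sketch below uses that formula with z = 1.96 for roughly 95% confidence; the function name `margin_of_error` and the survey numbers are invented for illustration.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample; z = 1.96 corresponds to about
    95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical survey: 1,000 people, 50% answer "yes".
moe = margin_of_error(0.5, 1000)
print(f"Margin of error: about {moe * 100:.1f} percentage points")
```

Larger samples shrink the margin of error: with the same 50% result, quadrupling the sample size cuts it in half.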
observational study
a data collection method in which subjects are observed and outcomes are recorded
open-ended questions
questions that offer a free-response/text approach
outcome
the variable that the treatment is meant to influence; this is sometimes known as the response, or dependent, variable
over-represented
represented excessively; especially, having representatives in a proportion higher than the average
parameter
any number that summarizes a population
Participatory Sensing
an approach to data collection and interpretation in which individuals, acting alone or in groups, use their personal mobile devices and web services to systematically explore interesting aspects of their worlds ranging from health to culture
population
consists of all of the people we want to learn something about
random assignment
subjects are randomly assigned to either the treatment or control group
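A minimal sketch of random assignment, assuming two equal-sized groups: shuffle the list of subjects, then split it in half. The subject names and the helper `randomly_assign` are hypothetical.

```python
import random

def randomly_assign(subjects, seed=None):
    """Split subjects at random into a treatment and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject names.
subjects = ["Ana", "Ben", "Cy", "Dee", "Eli", "Fay", "Gus", "Hal"]
treatment, control = randomly_assign(subjects, seed=1)
print("Treatment group:", treatment)
print("Control group:  ", control)
```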
random sample
a sample that is chosen randomly
random sampling
the process of choosing a sample randomly from a population
representative sample
a subset of a population that seeks to accurately reflect the characteristics of the larger group
research question
the question to be answered by the experiment
sample
people (or objects) that are selected from the population
sampling bias
occurs when the resulting samples tend to produce results that are influenced in one particular direction
self-reported
when participants answer questions themselves
sensor
a converter that measures a physical quantity and converts it into a signal, which can be read by an observer or by an instrument
statistic
a term used for numbers that summarize a sample
subjects
people or objects that are participating in the experiment
survey
an investigation about the characteristics of a given population by means of collecting data from a sample of that population and estimating their characteristics through the systematic use of statistical methodology
survey sample
people who are asked to participate in a survey
tags
the variable names are stored at the beginning of the code, in between <th> and </th>
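To show how variable names sit between <th> and </th> tags, the sketch below reads a tiny HTML table with Python's built-in `html.parser` module and collects the header text. The table contents and the class name `HeaderNames` are made up for this example.

```python
from html.parser import HTMLParser

class HeaderNames(HTMLParser):
    """Collect the text found between <th> and </th> tags."""

    def __init__(self):
        super().__init__()
        self.in_th = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "th":
            self.in_th = True

    def handle_endtag(self, tag):
        if tag == "th":
            self.in_th = False

    def handle_data(self, data):
        # Keep only non-empty text that appears inside a <th> element.
        if self.in_th and data.strip():
            self.names.append(data.strip())

# Hypothetical table; the column names are made up.
page = """
<table>
  <tr><th>name</th><th>height</th><th>age</th></tr>
  <tr><td>Ana</td><td>160</td><td>15</td></tr>
</table>
"""
parser = HeaderNames()
parser.feed(page)
print(parser.names)
```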
theory
an idea used to explain a situation
treatment
the variable that is deliberately manipulated to investigate its influence on the outcome; this is sometimes known as the explanatory, or independent, variable
treatment group
the group of subjects that receive the treatment
trigger
something that responds to an event so that an action can occur
under-represented
a subset of a population that holds a smaller percentage within a significant subgroup than the subset holds in the general population
XML (Extensible Markup Language)
a popular format for storing data on the internet, because it creates readable web pages and because it allows programmers to easily update values in the data table if those values change
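As a small illustration of reading and updating values stored as XML, the sketch below uses Python's built-in `xml.etree.ElementTree` module. The tag names (`students`, `student`, `name`, `height`) and the values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data table stored as XML; the tag names are made up.
xml_text = """
<students>
  <student><name>Ana</name><height>160</height></student>
  <student><name>Ben</name><height>172</height></student>
</students>
"""

root = ET.fromstring(xml_text.strip())

# Read every record in the table.
for student in root.findall("student"):
    print(student.findtext("name"), student.findtext("height"))

# Update one value: Ben's height changes to 174.
for student in root.findall("student"):
    if student.findtext("name") == "Ben":
        student.find("height").text = "174"

updated = ET.tostring(root, encoding="unicode")
print(updated)
```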