# Unit 1 Vocabulary

### algorithm

a process or set of rules for solving a mathematical problem

### bimodal

a distribution which has two peaks

### bin widths

the width of each rectangle (bin) in a histogram; it shows the size of the interval used to group the data along the x-axis

### bin(s)

an interval of values on the x-axis of a histogram, drawn as a bar whose height corresponds to how many data points fall in that interval
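To make the link between bins and bin widths concrete, here is a minimal sketch that counts how many data points land in each bin. It is written in Python rather than R, and the data values, the bin width of 10, and the helper name `count_per_bin` are all made up for illustration.

```python
from collections import Counter

def count_per_bin(values, bin_width):
    """Count how many values fall in each bin of the given width.

    Each bin is labeled by its left edge. In this sketch a value that
    sits exactly on a boundary is counted in the bin starting there.
    """
    counts = Counter()
    for v in values:
        left_edge = (v // bin_width) * bin_width
        counts[left_edge] += 1
    return dict(sorted(counts.items()))

heights = [3, 7, 12, 15, 18, 21, 29, 30]   # made-up data
bin_counts = count_per_bin(heights, bin_width=10)
print(bin_counts)  # {0: 2, 10: 3, 20: 2, 30: 1}
```

Each count is the height of the bar a histogram would draw for that bin.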

### campaign

gather and collect data

### categorical variables

variables whose values are words or labels that sort observations into categories

### center

useful for numerical variables, the center of the distribution often corresponds to our notion of ‘typical value’

### claim

a statement that something is true, which may or may not be supported by data

### collect

the process of gathering and measuring information

### columns

the vertical sets of cells in a data table; each column typically holds the values of one variable

### conditional relative frequency

the ratio of a joint relative frequency to the related marginal relative frequency
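The ratio can be worked through with a small example. The sketch below is in Python rather than R, and the counts (pet preference by grade level) are invented purely for illustration.

```python
# Hypothetical two-way counts: pet preference by grade level.
counts = {
    ("9th", "cat"): 10,
    ("9th", "dog"): 30,
    ("10th", "cat"): 25,
    ("10th", "dog"): 35,
}
total = sum(counts.values())  # 100 students in all

# Joint relative frequency: share of everyone who is in 9th AND prefers dogs.
joint = counts[("9th", "dog")] / total

# Marginal relative frequency: share of everyone who is in 9th grade.
marginal = (counts[("9th", "cat")] + counts[("9th", "dog")]) / total

# Conditional relative frequency: among 9th graders, the share preferring dogs.
conditional = joint / marginal
print(round(conditional, 2))  # 0.75
```

Here 30 of the 40 ninth graders prefer dogs, which matches the ratio of the joint to the marginal relative frequency.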

### console

a pane within RStudio; the place where RStudio is waiting for you to tell it what to do, and where it will show the results of a command; you type your code directly into the console

### data

information, or observations, that have been gathered and recorded

### data analysis

tables, graphs, and summaries of the data that are produced to help us find patterns and relationships

### data collection

the process of observing and recording data, or of examining previously collected data to make sure it meets the needs of an investigation

### data cycle

a guide we can use when learning to think about data

### data interpretation

the statistical questions are answered by referring to the tables, graphs, and summaries made in the Data Analysis phase

### data point

a single fact or piece of information

### data set(s)

a collection of data

### data table

arrangement of data

### data trails

the data collected about us as individuals that could be used to see the patterns in our personal lives

### distribution

a function or a listing which shows all the possible values of a variable and how often each occurs

### dotplot

a graphical display of data using dots

### environment

a pane within RStudio; where values and objects can be viewed

### ethics

a code of behavior, specifically what is right and wrong

### evaluate

to think carefully

### frequency

the number of times an outcome occurs
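As a small illustration, frequencies can be tallied by counting each outcome. This sketch uses Python's standard `collections.Counter` rather than R, and the coin flips are made-up data.

```python
from collections import Counter

outcomes = ["heads", "tails", "heads", "heads", "tails"]  # made-up coin flips
frequency = Counter(outcomes)

print(frequency["heads"])  # 3
print(frequency["tails"])  # 2
```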

### GPS

stands for Global Positioning System; it is a radio navigation system that allows land, sea, and airborne users to determine their exact location

### grouping

when the data are split into categories

### histogram

an approximate representation of the distribution of numerical data

### images

a representation of the external form of a person, thing, or picture

### input

the value you place into the algorithm

### joint (relative) frequency

a fraction that tells you what share of the whole group falls into one particular combination of categories in a two-way table (the count in one cell divided by the grand total)

### left-hand rule

when a data point falls on the boundary between two bins, the observation goes in the bin on the left-hand side
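One way to picture this rule, alongside the right-hand rule defined later in this list, is a small function that decides which bin a boundary value belongs to. This is only a sketch in Python (not R); the function name `bin_index`, the `rule` parameter, and the bin edges are assumptions made for the example, and values are assumed to lie within the outermost edges.

```python
from bisect import bisect_left, bisect_right

def bin_index(value, edges, rule="left"):
    """Return the index of the bin that `value` falls in.

    `edges` are the sorted bin boundaries. A value sitting exactly on a
    shared boundary goes to the left bin under rule="left" and to the
    right bin under rule="right".
    """
    last_bin = len(edges) - 2
    if rule == "left":
        i = bisect_left(edges, value) - 1
    else:
        i = bisect_right(edges, value) - 1
    # Keep the lowest and highest edges inside the first and last bins.
    return min(max(i, 0), last_bin)

edges = [0, 10, 20, 30]   # three bins: 0-10, 10-20, 20-30
left = bin_index(10, edges, rule="left")    # bin 0 (0-10)
right = bin_index(10, edges, rule="right")  # bin 1 (10-20)
print(left, right)  # 0 1
```

A value that is not on a boundary, such as 15, lands in the same bin under either rule.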

### left-skewed

the mean is typically less than the median; the tail of the distribution is longer on the left-hand side than on the right-hand side

### marginal (relative) frequency

the row and column totals shown in the margins of a two-way table; as a relative frequency, each total divided by the grand total

### maximum

the largest value

### minimum

the smallest value

### numerical variables

variables whose values are numbers that measure or count something

### observations

data that have been gathered and recorded

### organize

the method of classifying and organizing data sets to make them more useful

### output

the value(s) that are produced by an algorithm

### pane

a rectangular area within RStudio

### participatory sensing

an approach to data collection and interpretation in which individuals, acting alone or in groups, use their personal mobile devices and web services to explore interesting aspects of their worlds ranging from health to culture

### photo ethics

the principles that guide how we take and share photographs

### plot

a pane within RStudio; where plots/graphs/visualizations will be generated

### preview

a pane within RStudio that works like a spreadsheet; where you can see the variables (columns) and observations (indexed rows) of the data

### privacy

the right of individuals to have control over how their personal information is collected and used

### range

the largest value minus the smallest value
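For instance, the range can be computed directly from the minimum and maximum. The sketch below is in Python rather than R, and the temperatures are made-up data.

```python
temperatures = [61, 58, 70, 66, 73]  # made-up data

smallest = min(temperatures)     # the minimum
largest = max(temperatures)      # the maximum
data_range = largest - smallest  # range = maximum - minimum

print(data_range)  # 15
```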

### record

a collection of data

### rectangular or spreadsheet format

data arranged in rows and columns, as in a spreadsheet

### representations

the form in which data are stored, processed, and transmitted

### right-hand rule

when a data point falls on the boundary between two bins, the observation goes in the bin on the right-hand side

### right-skewed

the mean is typically greater than the median; the tail of the distribution is longer on the right-hand side than on the left-hand side
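A quick check of the mean-versus-median relationship, sketched in Python rather than R. The income values are invented so that most are small and one is very large, which gives a long right-hand tail.

```python
from statistics import mean, median

# Made-up right-skewed data: most values are small, one is very large.
incomes = [20, 22, 25, 27, 30, 32, 150]

avg = mean(incomes)    # pulled upward by the large value
mid = median(incomes)  # the middle value, 27

print(avg > mid)  # True
```

Reversing the pattern (one very small value among larger ones) would give a left-skewed set, where the mean typically falls below the median.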

### rows

the horizontal sets of cells in a data table; each row typically holds one observation

### scatterplot

a plot that uses dots to represent values for two different numeric variables

### shape

the placement of points in a distribution

### side-by-side bar plot

a plot where the bars for each group are placed next to one another, usually in different colors, used to compare things between different groups or to track changes over time

### spread

how much the values in a distribution vary; how widely the data are scattered around the center

### statistical investigative questions

questions that address variability and can be answered with data

### surveys

a research method used for collecting data to gain information and insights into various topics of interest

### symmetric

a type of distribution where the left side of the distribution mirrors the right side

### two-way frequency table

a table that displays counts for two categorical variables measured on the same group
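As a rough sketch of how such a table is built, the Python snippet below (not R) tallies made-up observations of grade level and favorite pet, then prints one row per grade with its total. The variable names and data are assumptions for the example.

```python
from collections import Counter

# Made-up observations: (grade, favorite pet) for each student.
observations = [
    ("9th", "dog"), ("9th", "cat"), ("10th", "dog"),
    ("10th", "dog"), ("9th", "dog"), ("10th", "cat"),
]
table = Counter(observations)  # count of each (grade, pet) combination

rows = sorted({grade for grade, _ in observations})
cols = sorted({pet for _, pet in observations})

print("grade", *cols, "total")
for r in rows:
    cells = [table[(r, c)] for c in cols]
    print(r, *cells, sum(cells))  # last number is the row's marginal total
```

Each cell count is a joint frequency, and the totals at the end of each row are marginal frequencies.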

### typical

“mean” or “average”; expected values

### unimodal

a distribution which has a single peak

### variability

how spread out a set of data is; variability gives you a way to describe how much data sets vary and allows you to compare your data to other sets of data

### variables

characteristics of an object or person

### visualization

a picture of the data

### x-axis

horizontal axis of a coordinate plane

### y-axis

vertical axis of a coordinate plane