
Glossary



A

Alternate hypothesis Usually denoted by H1 or Ha, the hypothesis that observations from a sample are influenced by a nonrandom element; the hypothesis the researcher is interested in.

Analysis of covariance Also known as ANCOVA; a combination of analysis of variance (ANOVA) and regression analysis that checks if the population means for a dependent variable are equal across an independent variable, while controlling for the presence of a covariate (or covariates).

Analysis of variance Also known as ANOVA; a statistical test that checks if the means for several groups are equal. It is used as a way to avoid the increasing probability of a Type I error that comes with running multiple t-tests.

Association A relationship between the variables being studied; when a change in one variable is related to a change in another variable.

Assumption of sphericity An assumption of repeated measures ANOVA that the difference scores of paired levels of the repeated measures factor have equal variance.

B

Bar chart A graphical representation of data, which is most useful for data at the nominal or ordinal level of measurement; the data categories are on the horizontal axis, whereas the frequencies of each category are on the vertical axis.

Bimodal distribution A distribution in which there are two modes.

Bonferroni test A method used to correct for Type I errors that can arise from multiple comparisons.

Boxplot A chart that represents the distribution of data values; it also illustrates the quartiles and any outliers.

Box's M test A method used to test the homogeneity of covariance matrices.

C

Categorical data Data made up of categorical variables, which are variables measured at the nominal or ordinal level.

Categorical variables A variable that can be counted and has a finite and fixed number of possible values (i.e., every value is assigned to a particular group or category); variables measured at the nominal and ordinal level.

Causality When a change in one variable is known to produce an effect or change in another variable.

Central tendency The propensity for quantitative data to cluster around a center value; measures of typical or average values in a set of data points.

Chi-square test A nonparametric test used to determine whether an actual distribution of categorical data values differs from the expected distribution.

Clinical significance Usually measured as effect size; may be used to determine the magnitude of impact of an intervention; useful for evaluating a clinical practice.

Codebook The window in SPSS that allows you to define the characteristics of your variables before data entry.

Coefficient of determination A measure of the amount of variability in one (dependent) variable that can be explained by the other (independent) variable.

Confidence interval The range of values within which an estimated point would be expected to fall.

Confidence level The level of assurance a researcher has that the data from a study/studies represent true values.

Confounding variable Any uncontrolled variable that may influence the outcome of a study.

Construct An idea or concept of interest.

Construct validity The degree to which an instrument or tool measures the specific idea of interest.

Content validity The degree to which a measurement tool captures the elements of the concept of interest.

Contingency tables A table used to display the frequency distributions of variables, often used to study the relationship between two or more categorical variables.

Continuous variable A variable that has an infinite number of possible values (i.e., every value on a continuum) or the infinite number of values between two consecutive values; variables measured at the interval and ratio level.

Correlation A standardized measure of the strength and direction of the relationship between two variables.

Covariance A measure of how two variables are related to each other, ranging from negative infinity to positive infinity; a covariance of zero indicates that there is no relationship between the variables.

Covariate A variable that influences the dependent variable, but is not the independent variable (i.e., not the variable of interest). Also known as a covariable.

Criterion-related validity The degree to which measurements from one tool or instrument may be correlated with measurements from other valid and reliable instruments.

Crosstab analysis Using a contingency table to study the relationship(s) between variables and focus in on the most significant relationships.
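
To make the definition concrete, here is a minimal pure-Python sketch of the goodness-of-fit version of the statistic; the observed and expected counts are hypothetical, invented for illustration:

```python
# Chi-square goodness-of-fit statistic for categorical counts
# (hypothetical data; degrees of freedom = number of categories - 1).
observed = [48, 35, 17]          # observed counts per category
expected = [40.0, 40.0, 20.0]    # counts expected under the null hypothesis

# Sum of (O - E)^2 / E over all categories
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

print(chi_square, df)
```

In practice the statistic is compared against the chi-square distribution with `df` degrees of freedom to obtain a p-value (in SPSS, via Analyze > Nonparametric Tests).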
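
The covariance and correlation entries can be illustrated with a short pure-Python sketch; the two variables and their values are hypothetical:

```python
# Sample covariance and Pearson correlation for two small variables.
from statistics import mean

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

mx, my = mean(x), mean(y)
n = len(x)

# Sample covariance: average co-deviation from the means (n - 1 denominator)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

# Correlation standardizes covariance to the -1 to +1 range
sx = (sum((a - mx) ** 2 for a in x) / (n - 1)) ** 0.5
sy = (sum((b - my) ** 2 for b in y) / (n - 1)) ** 0.5
r = cov / (sx * sy)

print(cov, r)
```

Dividing the covariance by the two standard deviations is what makes the correlation a standardized measure, as the Correlation entry above describes.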

D

Data The values of variables.

Data analysis menus The menus in SPSS that allow you to create statistical outputs; the Analyze and Graph menus.

Data cleaning Inspecting and correcting a data set to ensure that the data are complete and free of errors before analysis.

Data definition menus The menus in SPSS that allow you to add or change data; the Data and Transform menus.

Data file The Excel or SPSS program files that contain the data values being studied (extension .xls or .xlsx for Excel and .sav for SPSS).

Data set A collection of data.

Degrees of freedom A measurement of the opportunities for variability in a given statistical calculation.

Dependent group A group where there can be multiple values from a single source.

Dependent sample t-test Data analysis that studies mean differences when the measurements of a given dependent variable are paired.

Dependent variable An outcome variable that is affected or influenced by the independent variable.

Descriptive statistics Statistics that summarize the data from a sample, instrument, or scale such as central tendency and variation.

Discrete variable Also known as categorical variable; a variable that can be counted, but has a finite number of countable categories; variables measured at the nominal and ordinal level of measurement.

E

Effect When changes in the independent variable result in changes in the dependent variable.

Effect size The measure of the magnitude of the relationship or difference between groups; often used as a measure of the efficacy of an intervention or treatment.

Efficacy The effectiveness of a given intervention.

Enter method The default method in regression analysis, when all of the independent variables are fitted into the regression model at the same time.

Evidence-based practice Utilizing data from reliable, scientific studies in combination with clinical judgment and patient preferences to determine the best course of action.

External validity The level of confidence in whether or not the results of a study may be generalized from the sample to the target population.

F

Factorial analysis of variance Data analysis that studies the effects of two or more independent variables (factors) on the dependent variable.

Fisher's exact test A method used to test the relationship between categorical values in instances where the sample size is too small to use the chi-square test.

Frequency distribution A method for presenting data that includes possible values for a given variable and the number of times each value is present.

Friedman's ANOVA A nonparametric counterpart of repeated measures ANOVA, when measurements are repeated more than two times, that can be used when the assumptions are violated or when the sample size is too small.

F-statistic A test used to determine if the means of normally distributed populations are equal.

G

Generalizability The accuracy with which results from a sample can be extrapolated to encompass the population as a whole.

Goodness of fit A measure of how well a model fits a set of observations.

Group comparisons Comparing group values, as opposed to individual values.

H

Hierarchical method A method in regression analysis that utilizes blocks of independent variables (chosen based on importance), added one at a time, to see if there is any change in the predictability.

Histogram A visual method for presenting data that is similar to a bar chart, but instead groups data points into intervals, rather than individual categories; most useful for showing the distribution of continuous data.

Hypothesis testing Examining data to determine whether there is sufficient evidence to reject the null hypothesis.

I

Independent group A group where the values cannot overlap with the values of the comparison group.

Independent sample t-test A test of differences between means used when there are two independent groups to be compared.

Independent variable The variable that influences another variable(s). Also, the variable that the investigator controls or manipulates to affect the dependent variable.

Inferential statistics Statistics that allow a researcher to generalize about a population based on the results from the sample.

Instrument See Tool.

Internal consistency A measure of whether or not items in the same test, which purport to measure the same variable, are related.

Internal validity The degree to which changes on the dependent variable may be attributed to the independent variable. Strongly influenced by the quality of the study and control of confounding variables.

Interquartile range A measure of variability; the difference between the values at the 75th percentile and the 25th percentile of a data set.

Interrater reliability The ability of a test or scale to provide consistent values when used by different people.

Interval level of measurement Data are classified into categories with rankings and are mutually exclusive as in the ordinal level of measurement. In addition, specific meanings are applied to the distances between categories. These distances are assumed to be equal and can be measured.
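
A short sketch of this definition using Python's standard statistics module and a hypothetical data set; note that the exact quartile boundaries depend on the interpolation method (the module's default "exclusive" method is used here):

```python
# Interquartile range: the spread of the middle 50% of the data,
# computed as Q3 - Q1 (data values are hypothetical).
from statistics import quantiles

data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15]
q1, q2, q3 = quantiles(data, n=4)   # the three quartile cut points
iqr = q3 - q1

print(q1, q3, iqr)
```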

K

Kruskal–Wallis test A nonparametric method of analysis used to compare more than two independent samples to determine if the samples are from the same distribution.

Kurtosis A measure of the peakedness of a distribution.

L

Levels of evidence A ranking system used to determine the quality and strength of results from differing types of research studies.

Levels of measurementThe four different scales of measurement, used to differentiate types of data (and the statistical procedures appropriate for the data). They are nominal, ordinal, interval, and ratio.

Line chart A visual representation of data that is useful for following changes over time or for finding patterns in the data.

Linear Generally referring to the relationship of one variable to another, which resembles a line.

Linearity A statistical term that is used to represent a mathematical relationship and graphically shown as a straight line.

Logistic regression A type of regression analysis that predicts group membership in a categorical dependent variable with independent variables, which are usually continuous but can be categorical as well; called binary logistic regression when the dependent variable has two categories and multinomial logistic regression when it has more than two.

M

Mann–Whitney test A nonparametric method of analysis used to compare two independent groups to determine if the samples come from the same distribution.

Mean A measure of central tendency; the arithmetic average of all values in a data set.

Measurement error The difference between the true value of a variable and the value that has been measured.

Median A measure of central tendency; the exact middle value (when ordered consecutively) of a data set.

Method of least squares An approach used in regression analysis to find the line that best fits the data with the smallest sum of squared residuals.

Mode A measure of central tendency; the most frequently occurring number in a data set.

Multicollinearity A high correlation/relationship (over .85) between independent variables with one potentially being linearly predicted from the others (i.e., no added information explained).

Multimodal distribution A distribution in which there are three or more modes.

Multivariate When a design examines two or more dependent variables.

Multivariate analysis of covariance Also known as MANCOVA; an extension of analysis of covariance that is used in cases where there are one or more dependent variables and a covariate (or covariates) that needs to be controlled.

Multivariate analysis of variance Also known as MANOVA; an extension of analysis of variance that examines group differences on a combination of multiple dependent variables.
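
The three measures of central tendency defined above (mean, median, mode) can be computed directly with Python's standard statistics module; the data set here is hypothetical:

```python
# Mean: arithmetic average; median: exact middle value when ordered;
# mode: most frequently occurring value.
from statistics import mean, median, mode

scores = [3, 5, 5, 6, 8, 9, 13]

print(mean(scores), median(scores), mode(scores))
```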

N

Nominal level of measurement Data are classified into mutually exclusive categories where no ranking or ordering is imposed on categories.

Nonparametric Any tests or statistics that do not rely on an assumption of a normal distribution.

Nonrandom missing data When the data that are missing appear to follow a specific pattern.

Nonrandom sampling The selection of members from a population based on something other than chance, often used to make a study more feasible or when the population of interest is difficult to access. Types include convenience sampling, volunteer sampling, quota sampling, and snowball sampling.

Normal distribution A distribution of data in which the data values are equally distributed around the center data point. The bell curve is a normal distribution.

Null hypothesis Denoted as H0; the hypothesis that suggests there will be no statistically meaningful effect on the variable(s) being studied.

O

Odds ratio A descriptive statistic used in categorical data analysis that measures effect size (the strength of the association between two binary data values).

Omega squared The effect size for one-way ANOVA results.

One-sample t-test The simplest t-test; a one-sample t-test compares the mean score of a sample on a continuous dependent variable to a known value.

One-tailed test A test of significance that looks for an effect in a particular direction (positive or negative).

Ordinal level of measurement Data are classified into mutually exclusive categories, and ranking or ordering is imposed on categories.

Orthogonal planned contrasts A type of a priori test; comparisons that are planned before analysis of data has begun because certain results are expected. Orthogonal planned contrasts help reduce Type I error inflation that comes from multiple comparisons.

Outlier Any data value that is outside of the expected range of values.

Output file The SPSS files that contain the results of the statistical analyses, along with any error or warning messages (extension is .spv).

P

Parameter A characteristic of a population.

Parametric Any tests or statistics that assume a normal distribution in the data values.

Parametric tests Any tests or statistics that assume a normal distribution in the data values.

Partial correlation coefficient A measurement that allows us to look at the true relationship between two variables after controlling for an unwanted variable that may be affecting the relationship.

Percentage of variance Proportion of variation in a given data set explained by independent, mediating, and/or moderating variables.

Percentile Where a data point falls within the data set; specifically, how many data values fall above or below a specific point.

Phi and Cramer's V Statistics that report the strength of an association between two categorical variables.

Pie chart A circular chart in which the sections are proportionally representative of the frequencies of specific values of the given variable. It is most useful for the nominal and ordinal levels of measurement.

Point estimates Single values computed from sample data.

Population All the members of a group of interest.

Post hoc tests Comparisons made to data after analysis to determine which means are contributing the greatest amount of variance.

Power analysis A procedure used to calculate the minimum sample size required to be able to detect meaningful results based on effect size, or to calculate the level of power, given a sample size.

Predictor variable Another name for the independent variable; see independent variable.

Probability Likelihood that something will happen.

Process improvement Methods to optimize an organization's processes to achieve goals more efficiently and effectively.

Q

Qualitative variable Variables whose data values are nonnumeric.

Quality improvement Methods to improve the quality of product/service in order to provide the best customer/patient satisfaction.

Quantitative variable Variables whose data values are numeric.

R

Random errors Errors that occur by chance and are the result of unknown causes.

Random missing data When the data that are missing do not appear to follow any sort of pattern.

Random sampling The selection of members from a population based solely on chance; all members of the population have equal likelihood of being selected. Types include simple random sampling, systematic random sampling, stratified random sampling, and cluster sampling.

Range A measure of variability; the difference between the largest and smallest values in a data set.

Ratio level of measurement All characteristics of the interval level of measurement are present; in addition, there is a meaningful zero, and ratio or equal proportion is present.

Regression model The model created via regression analysis.

Relative risk A descriptive statistic that measures the probability of an event occurring if exposed to a specific risk factor.

Reliability The measure of whether or not a test is able to consistently measure a given variable.

Repeated measures analysis of variance A statistical test that checks if the means of two or more measures of a variable from the same group, either over different treatments or time periods, are equal.

Residual The difference between the observed data and the data fitted to the regression model. May also be thought of as unexplained variance.

R-square (R²) A statistical measure of how well a regression line approximates real data points. Also, a measure of the proportion of variation in a given data set explained by independent, mediating, and/or moderating variables.
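
The least-squares and R-square entries can be tied together in one small sketch: fit a line to hypothetical data by minimizing the sum of squared residuals, then compute R-square as the proportion of explained variation. All data values here are invented for illustration:

```python
# Fit a least-squares line y = intercept + slope * x, then compute
# R-square = 1 - (residual sum of squares / total sum of squares).
from statistics import mean

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]

mx, my = mean(x), mean(y)

# Least-squares slope and intercept
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
intercept = my - slope * mx

# Residuals are the differences between observed and fitted values
predicted = [intercept + slope * xi for xi in x]
ss_res = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
ss_tot = sum((yi - my) ** 2 for yi in y)
r_squared = 1 - ss_res / ss_tot

print(slope, intercept, r_squared)
```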

S

Sample A subset of a population under investigation; the results from research on the sample are used to generalize to the population as a whole.

Sampling The act of selecting a sample from a population.

Sampling distribution The distribution of a given statistic (e.g., mean) as derived from all possible samples of a population.

Sampling error The discrepancy between a statistic computed from a sample and the actual but unknown parameter from the population.

Scatterplot A visual representation of the relationship between two continuous variables.

Scheffé test A post-hoc test used to find which specific group means are different.

Simple effect analysis Statistical analysis that examines the effect of one variable at every level of the other variable, to confirm if the effect is present at each level.

Skewed When the data's mean is pulled toward one tail or the other; an absence of normal distribution in a data set.

Skewness A measure of how symmetrical a distribution is.

Sphericity-assumed statistics Repeated measures design statistics provided if the assumption of sphericity is not violated.

Standard deviation The average amount that data values will vary from the mean; the square root of the variance.

Standard normal distribution A distribution in which the scales have been standardized to be able to compare different distributions; the mean is equal to 0 and the variance is equal to 1.

Statistical inference A process of making inferences about a population based on statistics computed from a sample drawn from the population.

Statistical meaningfulness The degree of a statistic's applicability to real practice.

Statistical power The probability of correctly rejecting a false null hypothesis.

Statistics The characteristics of a sample; the process of analyzing data for the purpose of drawing conclusions or making inferences, especially from a sample to a population.

Stem and leaf plot A visualization of continuous data that shows both frequency distribution and information on individual data values.

Stepwise methods Methods used in regression analysis where variables are added/removed to the model based on predetermined statistical criteria. The three types of stepwise methods are forward selection, backward selection, and stepwise selection.

Syntax file The SPSS files that contain programmable commands used to generate analyses beyond those available in the interactive windows (extension is .sps).

Systematic error Errors that occur consistently because of known causes. One common source of systematic error is the incorrect use of tools or instruments.

T

Test–retest reliability The measure of a test's ability to consistently provide the same measurements across time.

Tool A device for measuring variables.

t-test A method for comparing two means.

Tukey test A post-hoc test used to find means that are significantly different from each other; probably the most flexible test.

Two-tailed test A test of significance that looks for an effect without concern as to the direction (positive or negative) of the effect.

Type I error When the null hypothesis is rejected by mistake; the probability of rejecting a true null hypothesis.

Type II error When the null hypothesis is not rejected by mistake; the probability of not rejecting a false null hypothesis.

U

Unimodal distribution A distribution in which there is only one value designated as the mode.

Univariate When a design examines a single dependent variable.

V

Validity The extent to which a test measures the variable it is designed to measure.

Variability How much the values for a given variable are spread over a given range.

Variable A trait or characteristic whose value is not fixed and can change (either from subject to subject, or within the same subject over time).

Variance A measure of variability; the average squared difference between the data values and the mean of a data set.

Variation A way of showing how data are spread out.

W

Wilcoxon rank-sum test See Mann–Whitney test.

Wilcoxon signed-rank test A nonparametric test used to compare paired data from the same population.

Windows and general-purpose menus In SPSS, the File, Edit, View, Utilities, and Help menus.

Z

z-score/standardized score A measure of how far above (positive value) or below (negative value) a score falls, as compared with the mean.
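
As an illustration of the definition, a minimal sketch that standardizes a hypothetical set of scores: each z-score is the raw score's distance from the mean, in units of the sample standard deviation.

```python
# z-score = (value - mean) / standard deviation; positive values fall
# above the mean, negative values below it (data are hypothetical).
from statistics import mean, stdev

scores = [10, 12, 14, 16, 18]
m, s = mean(scores), stdev(scores)   # sample standard deviation

z_scores = [(v - m) / s for v in scores]
print(z_scores)
```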