A. Introduction and Definitions

  1. Multivariable Analysis
    1. Tool for determining unique contributions of various factors to single event or outcome
    2. Factors which can contribute to outcome are called risk factors or independent variables
    3. Useful for identifying and/or eliminating confounding effects, or confounders
  2. Confounding
    1. Apparent association between risk factor and outcome affected by a third variable
    2. This third variable is called a confounder
    3. Confounder must be associated with the risk factor and causally related to outcome
  3. Clinical Efficacy of Therapies [3]
    1. Experimental Event Rate (EER) and Placebo Control Event Rate (CER) are measured
    2. Relative Risk Reduction (RRR) = |EER-CER|/CER is effect of intervention relative to control rate (| | denotes absolute value)
    3. Absolute Risk Reduction (ARR) = |EER-CER| for interventions which reduce bad outcomes
    4. Absolute Benefit Increase (ABI) = |EER-CER| for interventions which show benefits
    5. Number Needed to Treat (NNT) to have one (1) additional good outcome = 1/ABI
    6. In general, ARR and NNT are the most helpful in assessing the clinical efficacy
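The definitions above can be sketched directly in Python; the event rates used (10% treated vs 15% placebo) are invented for illustration:

```python
def efficacy_metrics(eer, cer):
    """Return (RRR, ARR, NNT) given experimental and control event rates."""
    arr = abs(eer - cer)   # Absolute Risk Reduction (or ABI for a good outcome)
    rrr = arr / cer        # Relative Risk Reduction
    nnt = 1 / arr          # Number Needed to Treat for one additional good outcome
    return rrr, arr, nnt

# Example: bad outcome in 10% of treated vs 15% of placebo patients
rrr, arr, nnt = efficacy_metrics(0.10, 0.15)
print(f"RRR={rrr:.2f} ARR={arr:.2f} NNT={nnt:.0f}")  # RRR=0.33 ARR=0.05 NNT=20
```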
  4. P (Probability) Value [2]
    1. Indicates the percentage likelihood that the results observed are due to chance alone
    2. By convention, a P value of 0.05 or less is considered significant
    3. Significant means the result is unlikely to be due to chance alone
    4. P values are affected by sample size and magnitude of effect
    5. If a sample is very small, the P value may be >0.05 although the magnitude of the effect is large
    6. If a sample is very large, the P value may be <0.05 although the magnitude of the effect is small
    7. Caution must be exercised in interpretation of P values without consideration for scientific and medical issues relevant to a study [2]
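Items 5 and 6 can be demonstrated with a two-proportion z-test. This is a normal-approximation sketch using only the standard library; the rates and sample sizes are invented:

```python
import math

def two_proportion_p(p1, p2, n1, n2):
    """Two-sided P value for a difference in proportions (normal approximation)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = abs(p1 - p2) / se
    # Two-sided P from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

# Same 5% absolute difference in event rates, different sample sizes:
print(two_proportion_p(0.10, 0.15, 100, 100))    # P > 0.05: not significant
print(two_proportion_p(0.10, 0.15, 2000, 2000))  # P < 0.05: significant
```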
  5. Confidence Intervals (CI)
    1. Usually a 95% confidence interval, but can be set at higher or lower levels
    2. A range of values within which the result would fall 95 times if the experiment were repeated 100 times
    3. Alternately, a range surrounding the result, within which the true result lies with 95% probability
    4. CI are affected by both the sample size and the magnitude of the effect
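Item 4 can be illustrated with an approximate 95% CI for a single event rate (a normal-approximation sketch; the counts are hypothetical):

```python
import math

def ci_95_proportion(events, n):
    """Approximate 95% confidence interval for an event rate (normal approximation)."""
    p = events / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p - 1.96 * se, p + 1.96 * se

# Same 20% event rate; the interval narrows as the sample grows
print(ci_95_proportion(20, 100))    # roughly (0.12, 0.28)
print(ci_95_proportion(200, 1000))  # roughly (0.18, 0.22)
```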
  6. Kappa Statistic
    1. Measures agreement beyond that due to chance alone
    2. Value <0.40 considered slight-to-fair agreement
    3. Maximal value is 1.0
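A sketch of the kappa calculation for a 2x2 agreement table; the cell counts are hypothetical:

```python
def kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 agreement table:
    a = both raters positive, d = both negative, b and c = disagreements."""
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance alone, from the marginal totals
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical table: 40 agree-positive, 40 agree-negative, 20 disagreements
print(round(kappa(40, 10, 10, 40), 2))  # 0.6
```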

B. Types of Multivariable Analysis

  1. Multiple Linear Regression
    1. Interval outcome
    2. Variable coefficients have a linear relationship with outcome
    3. Thus, coefficients "weight" the contribution of each variable (risk) to the outcome
    4. Used with continuous or interval outcomes such as blood pressure
    5. Equally sized differences on all parts of scale are equal
  2. Multiple Logistic Regression
    1. Dichotomous (yes/no) outcome
    2. Model constrains probability of outcome to between 0 and 1
    3. The antilogarithm of each coefficient equals the odds ratio for that variable
  3. Proportional Hazards (Cox) Regression
    1. Length of time to discrete event
    2. For longitudinal studies in which persons may be lost to follow-up
    3. Often used in prevention studies, in diseases with high mortality
    4. The antilogarithm of each coefficient equals the relative hazard for that variable
  4. Time-varying covariates can be introduced in circumstances where contributions of variables change over time
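The antilog relationships above can be sketched in one line each; the coefficient values below are invented for illustration:

```python
import math

# In logistic regression the antilog of a coefficient is an odds ratio;
# in Cox regression it is a relative hazard. Hypothetical fitted coefficients:
coefficients = {"smoking": 0.69, "age_per_decade": 0.18}

for name, beta in coefficients.items():
    print(f"{name}: ratio = {math.exp(beta):.2f}")  # e.g. exp(0.69) is about 2
```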

C. Risk Ratios

  1. Explanation in setting of example of a drug's impact on an outcome:
    1. Drug Group: A responders, B non-responders, A+B total patients treated with drug
    2. Placebo Group: C responders, D non-responders, C+D total patients in placebo group
  2. Relative Risk (RR, Relative Hazard)
    1. Probability that a person experiences an outcome in a short time interval
    2. Given that the person has survived to the beginning of the interval
    3. In example, RR=A/(A+B) ÷ C/(C+D) = Ax(C+D)/((A+B)xC)
  3. The odds ratio is ratio of the odds of responders in each group
    1. Odds of response in drug group = A/B
    2. Odds of response in placebo group = C/D
    3. Odds Ratio = A/B ÷ C/D = (AxD)/(BxC)
  4. The response rate attributable to the drug in this example is A/(A+B) - C/(C+D) = ARR
  5. When the outcome (response) is uncommon (<15%), the odds ratio and RR are similar
    1. This is because for B>>A and D>>C, RR collapses to AxD/(CxB)
    2. When outcome is common, odds ratio does not approximate the RR
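The 2x2 calculations above, with hypothetical cell counts showing when the odds ratio does and does not approximate the RR:

```python
def risk_ratios(a, b, c, d):
    """RR, OR, and attributable response rate for a drug-vs-placebo 2x2 table:
    a/b = drug responders/non-responders, c/d = placebo responders/non-responders."""
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    arr = a / (a + b) - c / (c + d)  # response rate attributable to the drug
    return rr, odds_ratio, arr

# Uncommon outcome (10% vs 5%): OR approximates RR
print(risk_ratios(10, 90, 5, 95))   # RR=2.0, OR about 2.11
# Common outcome (50% vs 25%): OR diverges from RR
print(risk_ratios(50, 50, 25, 75))  # RR=2.0, OR=3.0
```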
  6. Diagnostic Test results can be treated in the same way
    1. Disease Group: A have positive test, B have negative test
    2. Non-Disease Group: C have positive test, D have negative test
  7. Sensitivity of Test (SEN)
    1. Ability of a test to detect the disease
    2. SEN = Test Positive with Disease/Total Disease = A/(A+B)
  8. Specificity of Test (SPE)
    1. Ability of a test to rule out the disease
    2. SPE=Test Negative without Disease/No Disease = D/(C+D)
  9. Likelihood Ratios (LR) [4]
    1. LR is ratio of probability of test result among patients with target disorder to probability of that same test result among patients without the disorder
    2. Positive LR (LR for positive test) is calculated as sensitivity/(1-specificity)
    3. Negative LR (LR for negative test) is calculated as (1-sensitivity)/specificity
    4. Positive LR has also been called the Bayes Factor and is the same as the RR
  10. Probabilities and Odds Ratios
    1. Pretest probability is prevalence of disease: proportion of patients who have the target disorder before the test is carried out
    2. Post-test probability is proportion of patients with that particular test result who have the target disorder
    3. Pretest odds: odds that the patient has the target disorder before test is carried out, calculated as pretest probability/(1-pretest probability)
    4. Post-test odds: odds that the patient has the target disorder after the test is carried out, calculated as pretest odds x LR
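The pretest-to-post-test conversion above is a direct application of Bayes' rule via odds; the prevalence and LR values are hypothetical:

```python
def post_test_probability(pretest_prob, lr):
    """Post-test odds = pretest odds x LR; then convert odds back to probability."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    post_odds = pretest_odds * lr
    return post_odds / (1 + post_odds)

# 20% prevalence, positive test result with LR+ of 4.5
print(round(post_test_probability(0.20, 4.5), 2))  # 0.53
```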

D. Evaluation of Multivariable Models

  1. Residual Analysis
    1. Best way to assess whether model fits data
    2. Residuals are differences between observed and estimated values
    3. Approximately equal to the errors in estimation
    4. R-squared (R2) values reported for linear regression models
    5. R2 values close to 1 are excellent
  2. Are (all) the correct variables in the model?
  3. Are models to be explanatory or predictive?
    1. Related to which variables present
    2. Related to whether all or most variables known or unknown
  4. Are models reliable?
    1. Was the sample size used to generate the model sufficient?
    2. Did the sample represent the populations that will be studied in the future?
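Residual analysis and R-squared from item 1 can be sketched for a simple linear fit; the data points below are made up:

```python
# Hypothetical data roughly following y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

# Least-squares slope and intercept
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

# Residuals: observed minus estimated values
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
ss_res = sum(r ** 2 for r in residuals)
ss_tot = sum((y - my) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot
print(f"R2 = {r_squared:.3f}")  # close to 1: the model fits the data well
```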


References

  1. Katz MH. 2003. Ann Intern Med. 138(8):644
  2. Goodman SN. 1999. Ann Intern Med. 130(12):995
  3. Sackett DL and Haynes RB. 1997. ACP Journal Club. 127(1):A15
  4. Goodman SN. 1999. Ann Intern Med. 130(12):1005