Is There a Difference?
By the end of this chapter, students will be able to:

- Recall that in most experiments, the null hypothesis means that no relationship, association, or difference exists between the study groups or samples.

So how can we test to see whether there really is no difference? That's what we will discuss here: an actual test to see whether there is a statistically significant difference.
In this chapter, we will talk about the chi-square (χ²) test, which is appropriate when working with independent samples and an outcome or dependent variable that is nominal- or ordinal-level data. You already know that nominal-level data tell you that there is a difference in the quality of a variable, whereas ordinal-level data have a rank order so that one level is greater than or less than another.
For example, suppose you are an operating room nurse. You want to see whether there is a difference in the need for postoperative transfusion for patients who have surgery in the morning and patients who have surgery in the afternoon. Your independent variable identifies the groups you want to compare, a sample of morning patients and a sample of afternoon patients. Your outcome or dependent variable is postoperative transfusion, which is measured at a nominal level (yes or no). Now you want to see whether the frequencies you observe differ from the frequencies you would expect if the variables were independent (that is, not related).
Let's formulate a null hypothesis and an alternative hypothesis using the standard notation. Here are our hypotheses:

H0: There is no difference in the need for postoperative transfusion between morning and afternoon surgical patients.

HA: There is a difference in the need for postoperative transfusion between morning and afternoon surgical patients.
Your next step is to set up another 2 × 2 table. Statisticians love these! See Table 8-1.
Before you determine statistical significance, you need to determine the degrees of freedom (df), which refers to the number of cell values that are free to vary once the row and column totals of a 2 × 2 contingency table are known. With a chi-square test, the degrees of freedom is equal to the number of rows minus one times the number of columns minus one:
df = (2 - 1) × (2 - 1)
df = 1 × 1 = 1
All 2 × 2 tables have 1 degree of freedom. In other words, once you know the row and column totals and one additional cell value in the table, you can figure out the rest of the cell values in the table, and they do not change unless the original cell value changes. This is why there is only one value that is free to be unknown.
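To see why only one cell is free, here is a short sketch in Python (the counts are made up for illustration): given the margins and any single cell, the rest of a 2 × 2 table is forced.

```python
# Illustration (hypothetical counts): once the row totals, column totals,
# and one cell of a 2 x 2 table are known, every other cell is determined.
# That single free cell is the table's 1 degree of freedom.

def complete_2x2(row_totals, col_totals, a):
    """Rebuild a 2 x 2 table from its margins and the top-left cell a."""
    b = row_totals[0] - a        # remainder of row 1
    c = col_totals[0] - a        # remainder of column 1
    d = row_totals[1] - c        # forced by the remaining margins
    return [[a, b], [c, d]]

# Made-up margins: 80 and 120 patients per group, 100 yes / 100 no.
table = complete_2x2(row_totals=[80, 120], col_totals=[100, 100], a=30)
print(table)  # [[30, 50], [70, 50]]
```

Change the one free cell (`a`) and every other cell shifts with it, which is exactly the "1 degree of freedom" idea.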
Once you put your data into a statistical program, it will compute the expected values for each cell, assuming that the two variables are independent. You will then need to apply the chi-square test to see whether the observed values are significantly different from the expected values at 1 degree of freedom.
Also, note that the chi-square test doesn't tell you the direction of the relationship or difference. Suppose the p-value for your χ² is significant. In that case, all you know is that you can reject the null and that there is a statistically significant difference in your outcome variable between the two samples. As the statistical wizard you are, you then look again at your data to determine the direction of that difference. For instance, if you had a statistically significant χ² in the example about surgery time and postoperative transfusions, you could go back and look at which group had more transfusions. In this sample, a larger proportion of the afternoon patients needed transfusions. Given your statistically significant result, you could conclude that, for this sample, patients who had surgery in the afternoon were more likely to need a postoperative transfusion than patients who had surgery in the morning.
The chi-square test is one of the simplest tests available in a subset of statistics called categorical data analysis. The majority of theories relating to categorical data analysis started to be developed around the turn of the 20th century. Karl Pearson (1857-1936) was a very important statistician from England who is responsible for first developing the chi-square distribution. Pearson was an arrogant man who frequently butted heads with colleagues. He specifically disputed the intelligence and merits of a young statistician named R. A. Fisher (1890-1962), who has since become established as one of the most important scientists of all time. The two men argued over many different things. Fisher was concerned about what would happen to Pearson's chi-square statistic when sample sizes were extremely low, and Pearson didn't think it was a problem.
Fisher contributed a number of fundamentals of statistical science, and he developed the Fisher exact test. This test is now commonly used in place of Pearson's chi-square test when any cell in the data has an expected count of less than 5 (because, in the end, Fisher's arguments with Pearson proved correct). Fisher also used properties derived from Pearson's chi-square distribution to show that Gregor Mendel, the eminent geneticist and Augustinian priest who theorized about the inheritance of genetic traits using peas, most likely derived many of his theories from fabricated data. (Fisher remained convinced that one of Mendel's assistants was responsible for the fabrication; Mendel is still considered a very gifted geneticist.) In any event, the two tests, Pearson's chi-square test and Fisher's exact test, are now very common statistics to use in clinical trials and scientific research. Specifically, their use is very popular when a researcher wants to test whether a new treatment or therapy is better than the so-called gold standard already in use.
The null hypothesis that is tested in both these tests is as follows:
H0: The proportions being compared are equal in the population.
Here is a motivating example that's slightly more difficult than the one in the main text. This time we'll use a variable that has more than just two categories.
Suppose we want to study the relationship between a partner's occupation and a spouse's marital happiness in a clinically relevant population. Our null hypothesis is as follows:

H0: Partner's occupation is not related to spouse's marital happiness in this subset of occupations in the population we sampled.
To conduct the study, we enroll 1,868 spouses and ask them to report their partner's occupation and to rate the happiness of their marriage on the following 4-point scale: very happy, pretty happy, happy, or not too happy.
The results are shown in Table 8-2. Now, if you plug these values into any statistical program (or use the From the Statistician: Methods section from this chapter to hand-calculate the values), you'll see that the p-value for the difference between the happiness of the spouses of statisticians versus those of supermodels is so low that it is estimated at 0. So now you know that there is an association between the partner's occupation and the spouse's marital happiness (because your p-value is less than alpha, meaning it is significant). So you reject the null hypothesis that there isn't a relationship between a partner's occupation and a spouse's marital happiness.
Partner's Occupation

| Spouse's Marital Happiness | Statistician | Supermodel | Total |
|---|---|---|---|
| Very happy | 800 | 25 | 825 |
| Pretty happy | 706 | 10 | 716 |
| Happy | 200 | 25 | 225 |
| Not too happy | 2 | 100 | 102 |
| Total | 1,708 | 160 | 1,868 |
But what else might you like to know? Let's say that a friend is dating both a statistician and a supermodel and that each intends to propose this evening. What advice would you give your friend? Which might be the better choice to ensure long-term marital satisfaction? Remember that the chi-square test tells you only that a difference exists, not where it lies. But you remember from your college stats class that you can determine which is better by looking at the data. The proportion of spouses married to statisticians who reported being happy or higher was 99.9% (1706 ÷ 1708), whereas the proportion of spouses married to supermodels who reported being happy or higher was 37.5% (60 ÷ 160). Assuming that your friend wants to get married in the first place, which proposal should your friend accept? (Hint: In the statistics books, the statisticians always live happily ever after!)
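The direction check described above is plain arithmetic; a quick sketch using the counts from Table 8-2:

```python
# "Happy or higher" means very happy + pretty happy + happy (Table 8-2).
happy_statistician = 800 + 706 + 200   # 1706 of 1708 spouses
happy_supermodel = 25 + 10 + 25        # 60 of 160 spouses

p_statistician = happy_statistician / 1708
p_supermodel = happy_supermodel / 160

print(round(p_statistician, 3), round(p_supermodel, 3))  # 0.999 0.375
```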
Note: All the anecdotal information regarding Pearson and Fisher comes from Agresti's (2002) landmark book on the subject, Categorical Data Analysis.
You might be inclined to use a chi-square test anytime you have an outcome or dependent variable at the nominal level, but the test wouldn't always be a good choice. This is because the chi-square test includes some additional assumptions (in addition to requiring a nominal-level outcome or dependent variable), which must be met for the test to be used appropriately.
All cells within the 2 × 2 table must have an expected value greater than or equal to 5. If at least one cell in your 2 × 2 table has an expected value of less than 5, you should use the Fisher exact test instead. You should also note that if any of the cells in the frequency table have more than 5 but fewer than 10 expected observations, you can still use the chi-square test, but you need to apply a Yates continuity correction as well. The really nice thing in this day and age is that many statistical programs automatically make this correction when this condition occurs, saving you the time and trouble of doing it manually. You might want to look for it on your next Intellectus Statistics printout.
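For small cells, the Fisher exact test works directly from the hypergeometric distribution. Here is a minimal sketch using only the Python standard library (a teaching illustration, not a production implementation); the two-sided p-value sums the probabilities of every table with the same margins that is no more likely than the observed one:

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact test for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    p_observed = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Sum every table at least as extreme (no more probable) as observed.
    return sum(p for p in (prob(k) for k in range(lo, hi + 1))
               if p <= p_observed + 1e-12)

# Fisher's classic "lady tasting tea" table: p = 34/70, about 0.486.
p = fisher_exact_2x2(3, 1, 1, 3)
```

With a p-value near 0.49, this tiny table gives no evidence against the null, which is exactly why small samples need the exact test rather than the chi-square approximation.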
The sample should be random and independent. Here's an example of a violation of this assumption: Your study involved measuring the need for postoperative transfusion among brothers and sisters who underwent a particular procedure. (Because these subjects are related to each other, they are not independent: once you included the brother in the study, the sister was included as well, so her participation was dependent on her brother's selection.) In this case, the sample is not independent and random. Instead, the sample is now matched or paired, and the McNemar test is the correct choice to use for the analysis. (Both the Fisher and the McNemar tests are based on the same idea as the chi-square, but they have mathematical adjustments to accommodate the violation of the assumptions of the chi-square test.)
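For matched pairs, the McNemar statistic uses only the discordant pairs (pairs in which the two members had different outcomes). A minimal standard-library sketch, with hypothetical counts that are not from the text:

```python
from math import erfc, sqrt

def mcnemar(b, c):
    """Uncorrected McNemar test; b and c are the discordant-pair counts."""
    chi2 = (b - c) ** 2 / (b + c)     # statistic with df = 1
    p = erfc(sqrt(chi2 / 2))          # chi-square(1) upper-tail probability
    return chi2, p

# Hypothetical data: in 15 sibling pairs only the brother needed a
# transfusion; in 5 pairs only the sister did.
chi2, p = mcnemar(15, 5)
print(chi2)  # 5.0
```

Concordant pairs (both siblings transfused, or neither) drop out of the statistic entirely, which is the mathematical adjustment for the paired design.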
Our discussion has included independent and dependent variables. We are now also talking about independent and dependent samples, which sometimes confuses students. Independent and dependent are just the adjectives describing the attribute of whatever noun they are modifying.
Make sure you dont stop at independent and dependent. Consider what attributes they are describing to understand a question or article.
| Sample Type | Level of Dependent or Outcome Variable | Test | Research Example |
|---|---|---|---|
| Two independent samples | Nominal or ordinal | Chi-square | Is there a difference in the level of nursing education achieved for married and divorced nurses? (The independent variable is nominal, married/divorced; the dependent variable is ordinal, with four levels: AD, BS, MS, and DNP.) |
| Two dependent samples | Nominal or ordinal | McNemar's test | In motor vehicle accidents involving a passenger and a driver, is the driver or the passenger more likely to experience a head injury? (Driver and passenger are related because they are in the same car. This independent variable creates two dependent samples to compare; the dependent variable, head injury, is yes or no, so it is at the nominal level.) |
Note: The levels of the independent variable can create samples or groups. For example, marital status may be your independent variable. The groups you are interested in comparing are married people and people who are divorced, creating two samples or groups to compare.
Table 8-3 is a repeat of Table 8-2, for easier reference.
The results are shown in Table 8-4.
Partner's Occupation

| Spouse's Marital Happiness | Statistician | Supermodel | Total |
|---|---|---|---|
| Very happy | 800 | 25 | 825 |
| Pretty happy | 706 | 10 | 716 |
| Happy | 200 | 25 | 225 |
| Not too happy | 2 | 100 | 102 |
| Total | 1,708 | 160 | 1,868 |
| Expected Frequencies | Statistician | Supermodel |
|---|---|---|
| Very happy | (825 × 1708) ÷ 1868 = 754.34 | (825 × 160) ÷ 1868 = 70.66 |
| Pretty happy | (716 × 1708) ÷ 1868 = 654.67 | (716 × 160) ÷ 1868 = 61.33 |
| Happy | (225 × 1708) ÷ 1868 = 205.73 | (225 × 160) ÷ 1868 = 19.27 |
| Not too happy | (102 × 1708) ÷ 1868 = 93.26 | (102 × 160) ÷ 1868 = 8.74 |
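The rule behind these expected frequencies, expected count = (row total × column total) ÷ grand total, is easy to sketch in Python:

```python
def expected_counts(observed):
    """Expected cell counts under independence: row x column / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Observed counts from Table 8-3 (columns: statistician, supermodel).
observed = [[800, 25], [706, 10], [200, 25], [2, 100]]
expected = expected_counts(observed)
print(round(expected[0][0], 2))  # 754.34, matching Table 8-4
```

Note that the expected counts always share the observed table's margins, so their grand total is still 1,868.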
The big sigma character (Σ) just means to sum everything over all the cells; in our case, we calculate χ² = Σ (O - E)² ÷ E, where O is the observed count and E is the expected count in each cell. The degrees of freedom are:

(Number of rows - 1) × (Number of columns - 1)

In this case, the degrees of freedom are:

3 × 1 = 3
p < 0.0001
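The whole calculation can be checked with a short standard-library script; for df = 3 the chi-square upper-tail probability has a closed form, so no statistics package is needed. (The closed-form line is a mathematical identity, not something from the text.)

```python
from math import erfc, exp, pi, sqrt

# Observed counts from Table 8-3 (rows: very happy through not too happy;
# columns: statistician, supermodel).
observed = [[800, 25], [706, 10], [200, 25], [2, 100]]
row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
n = sum(row_totals)

# Chi-square statistic: sum of (O - E)^2 / E over all eight cells.
chi2 = sum(
    (obs - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i, row in enumerate(observed)
    for j, obs in enumerate(row)
)

# Upper-tail probability for a chi-square with df = 3 (closed form).
p = erfc(sqrt(chi2 / 2)) + sqrt(2 * chi2 / pi) * exp(-chi2 / 2)
```

The statistic comes out enormous and the p-value is effectively 0, matching the "estimated at 0" remark in the worked example.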
There are two main points to review in this chapter. First, you should understand the concept of the null hypothesis. The null hypothesis means that no relationship, association, or difference exists between the variables of interest. Second, the chi-square test is used to look for a statistically significant difference or relationship when you have a nominal- or ordinal-level dependent or outcome variable.
Don't forget to use your decision tree! See Figure 8-1.
Last, the chi-square test does not tell you the direction of the relationship; only you can make that interpretation.
That wraps up this chapter. Not too bad, right?
Questions 1-11: A study is completed to examine the relationship between gender identification and sports participation. It is conducted by randomly surveying ninth graders at Smith High School. The collected data are shown in Table 8-5.
| | Male | Not male | Total |
|---|---|---|---|
| No sports | 30 | 50 | 80 |
| Sports participation | 70 | 50 | 120 |
| Total | 100 | 100 | 200 |
| | Male | Not male | Total |
|---|---|---|---|
| No sports | 60 | 20 | 80 |
| Sports participation | 140 | 180 | 320 |
| Total | 200 | 200 | 400 |
Chi-square = 25, p < 0.0001.
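The reported statistic is easy to verify by hand, since the expected counts under independence are 40/40 in the first row and 160/160 in the second:

```python
from math import erfc, sqrt

# (O - E)^2 / E for each of the four cells, then summed.
chi2 = ((60 - 40) ** 2 / 40 + (20 - 40) ** 2 / 40
        + (140 - 160) ** 2 / 160 + (180 - 160) ** 2 / 160)
p = erfc(sqrt(chi2 / 2))   # upper-tail probability for df = 1

print(chi2)  # 25.0
```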
| | Pruritus | No Pruritus | Total |
|---|---|---|---|
| Screen positive | 98 | 12 | 110 |
| Screen negative | 53 | 651 | 704 |
| Total | 151 | 663 | 814 |
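For reference, the standard screening-test formulas applied to this table look like this (the cell labels are read from the row and column headers; the answer key below reports the same values):

```python
# Cells read from the pruritus table: rows are screen result,
# columns are true pruritus status.
tp, fp = 98, 12      # screen positive: with pruritus / without
fn, tn = 53, 651     # screen negative: with pruritus / without

sensitivity = tp / (tp + fn)    # 98 / 151
specificity = tn / (fp + tn)    # 651 / 663
npv = tn / (fn + tn)            # 651 / 704, negative predictive value
```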
You are asked to review a proposal for a grant-funded project. When you read the following, you immediately identify a problem. What is it?
The researchers plan to examine the relationship between shift (days/nights) and end-of-shift fatigue scores (0-100). They will randomly select five hospitals in the state and then randomly select 100 nurses to sample at each hospital. They will utilize an alpha of 0.10 for this pilot study. The analysis of how the shift affects the average fatigue score will be computed with a chi-square test, and subsequent p-values will be reported.
It is really helpful to start reading professional research articles to see how these concepts are used in actual study scenarios. So give it a try with this article. This study is a nurse-led initiative to help prevent the late detection of patient deterioration using continuous physiological monitoring. Early detection of patient deterioration can help prevent poor patient outcomes and extended length of stay.
Stellpflug, C., Pierson, L., Roloff, D., Mosman, E., Gross, T., Marsh, S., Willis, V., & Gabrielson, D. (2021). Continuous physiological monitoring improves patient outcomes. American Journal of Nursing, 121(4), 40-46.
Chi-Square Test of Independence
Chi-Square Distributions and Chi-Square Test of Independence
Chi-Square Test of Independence (2)
Open your Kidney Data Set project.
3. Nominal, mode, mode = participating in sports for males and for the total sample
5. H0: There is no relationship between gender and sports participation.
7. See Table 8-8.
| | Male | Not male |
|---|---|---|
| No sports | (80 × 100) ÷ 200 = 40 | (80 × 100) ÷ 200 = 40 |
| Sports participation | (120 × 100) ÷ 200 = 60 | (120 × 100) ÷ 200 = 60 |
If your alpha is 0.05, then yes, sports participation is significantly different for male and not-male students, because the p-value is less than alpha.
11. The outcome variable is nominal/ordinal. It is an independent sample, and the cell values are all >5.
13. Yes, females are more likely.
df = (Number of rows - 1) × (Number of columns - 1) = 1 × 1 = 1
17. 20 = true positives, 5 = false positives, 10 = false negatives, 215 = true negatives
19. 215/220 = 98%. If the subject does not have the disease, there is a 98% chance the screen will be negative. A specific screen is good at identifying those without the disease.
21. 215/225 = 96%. If the screening test is negative, it is probable that the subject does not have the disease.
23. You would need to use Fisher's exact test because of the small cell size.
25. Type of mammogram, nominal
29. Fail to reject the null, p > alpha
33. No, these would be dependent samples and would require McNemar's test. The chi-square test requires independent samples.
39. Because his screen is negative, we know there is an 84% chance he does not have oropharyngeal cancer.
45. Unable to determinethere is a significant difference, but you need more information to determine where.
49. Sensitivity = 98/151 = 65%
51. Specificity = 651/663 = 98.2%
53. We will have to find another study. Mine was a nonrandomized study of pregnant people. Generalizing these results to your situation would not be advisable, but I love that you are so interested in my work! Thanks, Dad!
1. The interventions implemented in 2016 improved critical metrics but still left gaps between the vital sign checks every 4 to 8 hours, in which patient deterioration was missed.
3. The use of continuous monitoring is not associated with improved patient surveillance and early identification of patient deterioration.
5. Sampling bias could be introduced. By eliminating the subjects where there was an issue with the equipment, the results may be skewed toward showing a greater effect than what was actually there while also masking a limitation of the impact of poorly functioning equipment.
7. A 53% decline (55/572 to 26/547), chi-square = 10.3931, p < 0.006. Yes, it was significant; reject the null.
9. A type II error, likely due to inadequate sample size to detect the effect size
11. Unable to determine because of the short time frame of the intervention period and the limited number of patient transfers to the ICU, but trends positive for financial savings
Answers to Data Analysis Application questions can be found in the Instructor Resource package accompanying this text.