Principles of Epidemiology
Philippines Pneumonia Study
Children in Bohol province in the Philippines were recruited into a longitudinal study of pneumonia incidence over the period July 2001-December 2004. Recruitment occurred when children attended a health clinic in the recruitment catchment area for their first vaccination, and children were followed until their second birthday or loss to follow-up. An outcome of interest was the occurrence of pneumonia, the most common cause of mortality in children in this area. The diagnostic definition followed WHO criteria, which are themselves based on clinical signs such as cough, fast breathing and chest indrawing.
The data are a sample of the 12,000 children recruited into the study.
CODEBOOK
Variable | Description | Code
PID | Child ID | Sequential number from 1
Sex | Sex of child | Male; Female
N_Children | Number of children in the household | Number
N_Siblings | Number of live siblings of the child | Number
Mother_Age | Mother’s age at birth of child | In years
Mother_Education | Mother’s education | Primary or High School; College
Mother_Employed | Mother’s occupation | Emp; Unemp
Birth_Place | Place of birth | Home; Health Facility
Birth_Attendant | Birth attendant | Physician Midwife; Other
Vac_AgeMths | Age at child’s first vaccination | In completed months
Weight_kg | Weight at first vaccination | In kg
Weight_Group | Weight group at first vaccination | < 4.5 kg; 4.5 – 5.4 kg; >= 5.5 kg
Vac_Season | Season of first vaccination | Winter: months 12, 1, 2; Spring: months 3, 4, 5; Summer: months 6, 7, 8; Autumn: months 9, 10, 11
Pneumonia | Child experiences an event of WHO-defined pneumonia in his/her follow-up period (up to second birthday) | 0 = No; 1 = Yes
The research questions to be answered are as follows:
1. Is there an association between level of mother’s education and the incidence of pneumonia?
2. If yes, what is the strength of this association?
3. Is there an association between maternal unemployment and the incidence of pneumonia? What is the strength of this association?
4. As discussed in the lecture, stratified analysis is one method for assessing the potential for confounding. Conduct the following stratified analyses to test for confounders:
a. Whether ‘number of children in the household’ is likely to confound the observed association between mother’s education level and incidence of pneumonia.
b. Whether mother’s education level is likely to confound the observed association between maternal unemployment and incidence of pneumonia.
5. Discuss your findings and interpretations.
Materials
1. This task sheet
2. Data worksheet titled Pneumonia.xls
3. Supplement 1: Creating pivot tables in Excel.
4. Supplement 2: 95% CI and Adjusted RR
STEP 1
1. To assess the presence of an association between education and incidence of pneumonia, consider mothers with Primary / High school education as ‘exposed’; and mothers with College education as ‘unexposed’.
2. Using the data from the Pneumonia worksheet, set up a 2 x 2 table to calculate the incidence of pneumonia in the exposed and unexposed groups.
HINT: To extract the data for the 2 x 2 table, you could use the ‘Pivot table’ function in Excel, following the instructions provided in Supplement 1. Remember to set it up according to the configuration below:
Mother’s education | Pneumonia: Yes | Pneumonia: No
Primary / High school | a | b
College | c | d
3. Once you get the data for the 2 X 2 table from the pivot function, copy and paste this data into a new worksheet; and label it Relative Risk Education.
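4. Use this data to calculate the relative risk of pneumonia in the exposed group (RR = IE / Io), where IE = a / (a + b) is the incidence in the exposed group and Io = c / (c + d) is the incidence in the unexposed group.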
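For readers who want to cross-check the spreadsheet result outside Excel, the tabulation and relative risk in Step 1 can be sketched in Python. This is only an illustration under stated assumptions: the eight records below are made-up placeholders, not study data, and the function names `two_by_two` and `relative_risk` are hypothetical.

```python
from collections import Counter

# Hypothetical records of (Mother_Education, Pneumonia); NOT study data.
records = [
    ("Primary or High School", 1),
    ("Primary or High School", 1),
    ("Primary or High School", 0),
    ("Primary or High School", 0),
    ("College", 1),
    ("College", 0),
    ("College", 0),
    ("College", 0),
]

def two_by_two(rows, exposed_label):
    """Return the cells (a, b, c, d) in the layout of the 2 x 2 table above."""
    counts = Counter((edu == exposed_label, outcome) for edu, outcome in rows)
    a = counts[(True, 1)]   # exposed, pneumonia
    b = counts[(True, 0)]   # exposed, no pneumonia
    c = counts[(False, 1)]  # unexposed, pneumonia
    d = counts[(False, 0)]  # unexposed, no pneumonia
    return a, b, c, d

def relative_risk(a, b, c, d):
    """RR = incidence in the exposed / incidence in the unexposed."""
    return (a / (a + b)) / (c / (c + d))

a, b, c, d = two_by_two(records, "Primary or High School")
rr = relative_risk(a, b, c, d)
print(a, b, c, d, rr)
```

With the real worksheet, the four cells would come from the pivot table rather than from a hand-typed list.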
STEP 2
1. Calculate the standard error of the logarithm of this relative risk and, from it, the 95% confidence interval for the relative risk.
HINT: Follow the instructions for this calculation in Supplement 2
2. Create a text box below the calculations and record your interpretation of the strength of the association as a percentage increase in risk. State whether this increased risk is statistically significant, i.e. whether the 95% confidence interval excludes 1.
STEP 3
1. Repeat steps 1 & 2, this time to test the presence and strength of any association between maternal unemployment and incidence of pneumonia. Set up your 2 X 2 table with the following configuration:
Mother’s employment | Pneumonia: Yes | Pneumonia: No
Unemp | a | b
Emp | c | d
Copy and paste the table into a new sheet, and name it Relative Risk Unemployment. Record your interpretation of the strength of this association, and significance, if any.
STEP 4
1. To conduct the stratified analysis for each potential confounder, go back to the Pivot Table and modify its structure. The essential step is to extract stratum-specific 2 x 2 tables and to calculate the Relative Risk of pneumonia according to mothers’ education separately within each stratum of the potential confounder: for example, for mothers who had only 1 child, then for mothers who had 2 children, and so on. Supplement 1 on Pivot Tables has additional material on how to modify the Pivot Table to yield stratum-specific 2 x 2 tables.
2. Repeat the steps from above to calculate the relative risks and 95% CI for each stratum. Remember to copy and paste the 2 X 2 tables for each stratum into a new worksheet, show your calculations here, and name it Number of Children.
3. After you compute all the indicators, summarise the RRs and 95% CIs into a table on the same worksheet, with the following configuration:
Number of children | RR (lower bound) | RR (point estimate) | RR (upper bound)
All children | | |
1 | | |
2 | | |
3 (and so on) | | |
4. Using the data in the 2 X 2 tables, and the formula in Supplement 2, calculate the adjusted (unconfounded) estimate of the Relative Risk. Record it next to the table of results. Compare the crude RR (for all children) with this adjusted RR; and refer to the criteria provided at the end of Supplement 2, to interpret the presence and direction of confounding.
5. ‘Eyeball’ these results, and record your interpretation in a text box below the table, stating whether or not the number of children in the household confounds the observed association between level of mothers’ education and incidence of pneumonia.
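The stratification in Step 4 amounts to building one 2 x 2 table per level of the potential confounder. A minimal Python sketch of that idea follows. The records are made-up placeholders (not study data), the function name `stratified_tables` is hypothetical, and exposure is assumed to be pre-coded as True/False.

```python
from collections import defaultdict

# Hypothetical records of (N_Children, exposed, Pneumonia); NOT study data.
records = [
    (1, True, 1), (1, True, 0),
    (1, False, 1), (1, False, 0), (1, False, 0), (1, False, 0),
    (2, True, 1), (2, True, 1), (2, True, 1), (2, True, 0),
    (2, False, 1), (2, False, 0),
]

def stratified_tables(rows):
    """Map each stratum to its 2 x 2 cells [a, b, c, d]."""
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for stratum, exposed, outcome in rows:
        # Cell index: 0 = a (exposed, case), 1 = b (exposed, non-case),
        #             2 = c (unexposed, case), 3 = d (unexposed, non-case)
        index = (0 if exposed else 2) + (0 if outcome == 1 else 1)
        tables[stratum][index] += 1
    return dict(tables)

tables = stratified_tables(records)
for stratum in sorted(tables):
    a, b, c, d = tables[stratum]
    rr = (a / (a + b)) / (c / (c + d))  # stratum-specific relative risk
    print(stratum, (a, b, c, d), rr)
```

A stratum with no unexposed cases (c = 0) would cause a division by zero here, which is one reason sparse strata are often combined, as in a ">= 9 children" group.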
STEP 5
1. Repeat the stratified analysis for the second association, i.e. whether level of education confounds the observed association between maternal unemployment and incidence of pneumonia. You can modify the same pivot table, copy and paste the resulting stratum-specific 2 x 2 tables into a new sheet, and name it Maternal Unemployment.
2. In a similar manner, calculate the adjusted RR, and record your interpretations on confounding in a text box on this sheet.
How to Calculate the Confidence interval for ratio measures
The 2 x 2 table notation for this section is as follows:
Exposure | Outcome: Disease | Outcome: Healthy | Total
Exp | d1 | h1 | n1
Unexp | d0 | h0 | n0
Total | d | h | n
Risk Ratio: RR = Iexp / Iunexp, where Iexp = d1 / n1 and Iunexp = d0 / n0
First calculate the standard error of the logarithm of the risk ratio, using the delta method:
s.e.(log RR) = sqrt(1/d1 - 1/n1 + 1/d0 - 1/n0)
Then calculate the Error Factor:
EF = exp(1.96 × s.e.(log RR))
Calculate the Confidence Interval of RR:
95% CI (RR) = RR / EF to RR × EF
Where RR / EF = lower bound of 95% CI
And RR × EF = upper bound of 95% CI
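The calculation above can be checked with a short Python sketch. It uses the counts from the N_Children = 1 table in the worked example later in this document (d1 = 158, n1 = 946, d0 = 125, n0 = 979). The function name `rr_confidence_interval` is hypothetical, and the code is offered only as a cross-check on the spreadsheet, not as a required part of the task.

```python
import math

def rr_confidence_interval(d1, n1, d0, n0, z=1.96):
    """Risk ratio with a 95% CI using the error-factor method above."""
    rr = (d1 / n1) / (d0 / n0)
    # Standard error of log(RR)
    se_log_rr = math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)
    # Error factor
    ef = math.exp(z * se_log_rr)
    return rr, rr / ef, rr * ef

# Counts taken from the N_Children = 1 stratum of the worked example
rr, lower, upper = rr_confidence_interval(d1=158, n1=946, d0=125, n0=979)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Because the interval is built on the log scale, it is symmetric around RR in ratio terms: the upper bound divided by RR equals RR divided by the lower bound.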
ADJUSTED RR
The following steps describe the process to calculate a pooled ‘unconfounded’ estimate of relative risk from stratified data.
Consider an example of data distributed across four levels of a confounder. This will take the form of four 2 x 2 tables as follows:
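Level 1:
Exposure | Outcome: Yes | Outcome: No
Exp | a1 | b1
Unexp | c1 | d1
Total: T1

Level 2:
Exposure | Outcome: Yes | Outcome: No
Exp | a2 | b2
Unexp | c2 | d2
Total: T2

Level 3:
Exposure | Outcome: Yes | Outcome: No
Exp | a3 | b3
Unexp | c3 | d3
Total: T3

Level 4:
Exposure | Outcome: Yes | Outcome: No
Exp | a4 | b4
Unexp | c4 | d4
Total: T4

Here Ti is the total number of subjects in the Level i table (Ti = ai + bi + ci + di).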
Adjusted RR = ((a1*d1)/T1+ (a2*d2)/T2+ (a3*d3)/T3+ (a4*d4)/T4)/((b1*c1)/T1+ (b2*c2)/T2+ (b3*c3)/T3+ (b4*c4)/T4)
The above formula can be extended to any number of strata.
INTERPRETATION OF CONFOUNDING BASED ON COMPARISON OF CRUDE AND ADJUSTED RR
No confounding: Crude RR ≈ Adjusted RR
Positive confounding: Crude RR > Adjusted RR
Negative confounding: Crude RR < Adjusted RR
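The three criteria above can be expressed as a small Python sketch. The criteria do not say how close the two estimates must be to count as approximately equal, so the 10% `tolerance` below is an assumption introduced purely for illustration. The function name and the three pairs of RR values are also hypothetical.

```python
def classify_confounding(crude_rr, adjusted_rr, tolerance=0.10):
    """Apply the crude-versus-adjusted comparison listed above.

    `tolerance` is an ASSUMED cut-off for "approximately equal";
    the criteria above do not specify one.
    """
    relative_change = abs(crude_rr - adjusted_rr) / adjusted_rr
    if relative_change <= tolerance:
        return "no confounding"
    if crude_rr > adjusted_rr:
        return "positive confounding"
    return "negative confounding"

# Hypothetical (crude, adjusted) pairs, one for each outcome
print(classify_confounding(1.50, 1.48))
print(classify_confounding(2.00, 1.40))
print(classify_confounding(1.10, 1.60))
```

Any fixed cut-off is a convention rather than a rule, so the classification should be read alongside the stratum-specific estimates and their confidence intervals.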
Example:
N_Children = 1
Mother_Education | Pneumonia = 1 | Pneumonia = 0 | Grand Total
Primary/High School | 158 | 788 | 946
College | 125 | 854 | 979
Grand Total | 283 | 1642 | 1925

N_Children = 2
Mother_Education | Pneumonia = 1 | Pneumonia = 0 | Grand Total
Primary/High School | 264 | 1166 | 1430
College | 172 | 1007 | 1179
Grand Total | 436 | 2173 | 2609

N_Children = 3
Mother_Education | Pneumonia = 1 | Pneumonia = 0 | Grand Total
Primary/High School | 222 | 949 | 1171
College | 136 | 649 | 785
Grand Total | 358 | 1598 | 1956

N_Children = 4
Mother_Education | Pneumonia = 1 | Pneumonia = 0 | Grand Total
Primary/High School | 170 | 624 | 794
College | 70 | 399 | 469
Grand Total | 240 | 1023 | 1263

N_Children = 5
Mother_Education | Pneumonia = 1 | Pneumonia = 0 | Grand Total
Primary/High School | 95 | 330 | 425
College | 48 | 187 | 235
Grand Total | 143 | 517 | 660

N_Children = 6
Mother_Education | Pneumonia = 1 | Pneumonia = 0 | Grand Total
Primary/High School | 81 | 212 | 293
College | 23 | 115 | 138
Grand Total | 104 | 327 | 431

N_Children = 7
Mother_Education | Pneumonia = 1 | Pneumonia = 0 | Grand Total
Primary/High School | 37 | 132 | 169
College | 19 | 70 | 89
Grand Total | 56 | 202 | 258

N_Children = 8
Mother_Education | Pneumonia = 1 | Pneumonia = 0 | Grand Total
Primary/High School | 19 | 59 | 78
College | 9 | 34 | 43
Grand Total | 28 | 93 | 121

N_Children >= 9
Mother_Education | Pneumonia = 1 | Pneumonia = 0 | Grand Total
Primary/High School | 25 | 104 | 129
College | 11 | 48 | 59
Grand Total | 36 | 152 | 188
Adjusted RR = (70.1 + 101.9 + 73.7 + 53.7 + 26.9 + 21.6 + 10.0 + 5.3 + 6.4) / (51.2 + 76.9 + 66.0 + 34.6 + 24.0 + 11.3 + 9.7 + 4.4 + 6.1) = 369.6 / 284.2 = 1.3
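The worked example can be reproduced with a short Python sketch that applies the pooled formula exactly as written in this supplement to the nine stratum tables above, and also computes the crude RR from the collapsed table for comparison. The function names are hypothetical. One caution for readers comparing against statistical software: this cross-product form (a × d over b × c, each divided by T) is the one usually labelled the Mantel-Haenszel odds ratio, so a package's "Mantel-Haenszel risk ratio" may give a slightly different figure.

```python
# (a, b, c, d) for each N_Children stratum, copied from the pivot tables above
strata = [
    (158, 788, 125, 854),    # 1 child
    (264, 1166, 172, 1007),  # 2 children
    (222, 949, 136, 649),    # 3 children
    (170, 624, 70, 399),     # 4 children
    (95, 330, 48, 187),      # 5 children
    (81, 212, 23, 115),      # 6 children
    (37, 132, 19, 70),       # 7 children
    (19, 59, 9, 34),         # 8 children
    (25, 104, 11, 48),       # 9 or more children
]

def pooled_estimate(tables):
    """Pooled estimate using the cross-product formula given in this supplement."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

def crude_rr(tables):
    """Relative risk from the single 2 x 2 table obtained by summing all strata."""
    a, b, c, d = (sum(cells) for cells in zip(*tables))
    return (a / (a + b)) / (c / (c + d))

adjusted = pooled_estimate(strata)
crude = crude_rr(strata)
print(f"Adjusted = {adjusted:.2f}, crude = {crude:.2f}")
```

Summing the cells across strata gives the same 2 x 2 table as the unstratified pivot table, which is why `crude_rr` can reuse the stratum data.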