
Table 3 SEER test-retest reliability – Estimates of intraclass correlation coefficients (ICC) and Cohen’s kappa coefficient

From: Development and validation of SEER (Seeking, Engaging with and Evaluating Research): a measure of policymakers’ capacity to engage with and use research

Higher scores indicate greater capacity, more research engagement actions and use.

| Factor (items, scoring, possible range) | Test 1: Mean (SD) [n] or response options | Test 1: IQR or freq. (%) | Test 2: Mean (SD) [n] or response options | Test 2: IQR or freq. (%) | Test-retest^c ICC (95% CI) | Organisation^c ICC (95% CI) | Weighted^a kappa (95% CI) |
|---|---|---|---|---|---|---|---|
| Capacity – Predisposing factors |  |  |  |  |  |  |  |
| 1. Value individual places on using research (7 items, summed, range 7–35) | 29 (3.7) [143] | 26–32 | 29 (3.8) [57] | 27–31 | 0.59 (0.40–0.75) | 0 |  |
| 2. Confidence in using research (7 items, summed, range 7–35) | 25 (6.0) [144] | 22–28 | 24 (5.6) [57] | 23–27 | 0.85 (0.76–0.91) | 0.05 (0.00–0.47) |  |
| 3. Value organisation places on using research (5 items, summed, range 5–25) | 19 (3.5) [144] | 18–21 | 19 (3.7) [57] | 16–21 | 0.76 (0.63–0.85) | 0.13 (0.03–0.44) |  |
| 4. Tools and systems organisation has to support research use (7 items, summed,^d range 7–21) | 14 (3.6) [144] | 11–16 | 13 (3.6) [57] | 10–16 | 0.70 (0.49–0.85) | 0.48 (0.23–0.73) |  |
| Research engagement actions |  |  |  |  |  |  |  |
| 5. Accessed synthesised research (two items, binary – yes/no^e) | Yes / No | 112 (79) / 30 (21) | Yes / No | 44 (79) / 12 (21) |  |  | 0.40 (0.10–0.69) |
| 6. Accessed primary research (two binary items, summed, ordinal – 0, 1, 2) | 0 / 1 / 2 | 20 (14) / 27 (19) / 95 (67) | 0 / 1 / 2 | 6 (11) / 10 (18) / 40 (71) |  |  | 0.49 (0.21–0.75)^b |
| 7. Appraised research (three binary items, summed, ordinal – 0, 1, 2, 3) | 0 / 1 / 2 / 3 | 10 (8) / 7 (6) / 15 (12) / 89 (74) | 0 / 1 / 2 / 3 | 3 (6) / 6 (12) / 7 (14) / 34 (68) |  |  | 0.34 (0.04–0.69)^b |
| 8. Generated research (three binary items, coded yes if response to any item is yes, binary – yes, no) | Yes / No | 107 (76) / 33 (24) | Yes / No | 39 (70) / 17 (30) |  |  | 0.39 (0.12–0.66) |
| 9. Interacted with researchers (6 items, summed, range 6–24) | 12 (4.7) [140] | 8–15 | 11 (4.2) [55] | 7–14 | 0.83 (0.66–0.92) | 0 |  |
| Research use |  |  |  |  |  |  |  |
| 10. Extent of research use (4 items, choose item with highest score, range 1–6)^f | 5 (1.1) [140] | 4–6 | 5 (1.2) [54] | 4–6 | 0.65 (0.47–0.79) | 0.14 (0.03–0.44) |  |
| 11. Conceptual research use (one item, binary – yes, no) | Yes / No | 125 (89) / 15 (11) | Yes / No | 51 (93) / 4 (7) |  |  | 0.24 (−0.22 to 0.69) |
| 12. Instrumental research use (one item, binary – yes, no) | Yes / No | 119 (85) / 21 (15) | Yes / No | 50 (91) / 5 (9) |  |  | 0.49 (0.11–0.88) |
| 13. Tactical research use (one item, binary – yes, no) | Yes / No | 117 (84) / 23 (16) | Yes / No | 46 (84) / 9 (16) |  |  | 0.15 (−0.18 to 0.47) |
| 14. Imposed research use (one item, binary – yes, no) | Yes / No | 66 (47) / 74 (53) | Yes / No | 24 (44) / 31 (56) |  |  | 0.43 (0.18–0.67) |

ICC intraclass correlation coefficient, SD standard deviation, IQR inter-quartile range, n sample size

^a Weighted kappa using quadratic weights. Weights indicate the ‘degree’ of agreement. For example, for an ordinal variable with four values (0, 1, 2, 3), the weights are 0.8889, 0.5556 and 0 for scores one, two and three apart, respectively, so a participant who scores 1 on the first application of the survey and 2 on the second is credited with 0.8889 agreement. With no a priori rationale for a particular weighting scheme, this quadratic scheme is recommended [27]. In addition, a kappa statistic calculated using this weighting scheme yields the same estimate as an ICC (see the code sketches following these notes).

^b Confidence interval for kappa calculated using bootstrapping; bias-corrected 95% confidence intervals were calculated from 1000 replicates.

^c ICCs calculated by fitting a random effects model with two random effects (participant and organisation). This provides a measure of absolute agreement.

^d Response options are: no (1), yes but limited (2), yes well developed (3), and I don’t know. For the purpose of psychometric testing, “I don’t know” was recoded as “no” rather than as a missing value, on the assumption that systems/tools are unlikely to function as a predisposing factor (i.e. motivating research engagement) if staff are unaware of their existence.

^e Single score of ‘yes’ for the factor if the respondent answered ‘yes’ to either or both items.

^f The score for this factor is the highest score from the four items (each reflecting a different stage of policy work: agenda setting/scoping, policy or programme development, policy or programme implementation, policy or programme evaluation). Respondents can answer ‘not applicable’ (0) for stages they have not covered, but at least one stage should be applicable, so the minimum score for this factor is 1.
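Footnotes ^a and ^b describe the quadratic weighting and bootstrap procedures in enough detail to reproduce them. The Python sketch below is illustrative only and is not the authors’ code: the function names (`quadratic_weights`, `weighted_kappa`, `bc_bootstrap_ci`), the example data, and the use of NumPy/SciPy are assumptions. It builds the quadratic agreement weights (reproducing the 0.8889, 0.5556 and 0 values quoted for a four-category factor), computes the weighted kappa for two applications of the survey, and derives a bias-corrected percentile bootstrap interval from 1000 replicates.

```python
# Illustrative sketch (not the authors' code): quadratic-weighted kappa and a
# bias-corrected (BC) percentile bootstrap confidence interval, per footnotes a and b.
import numpy as np
from scipy.stats import norm


def quadratic_weights(k):
    """Agreement weights w_ij = 1 - (i - j)^2 / (k - 1)^2 for k ordered categories.
    For k = 4 this gives 0.8889, 0.5556 and 0 for scores one, two and three apart."""
    i, j = np.meshgrid(np.arange(k), np.arange(k), indexing="ij")
    return 1.0 - (i - j) ** 2 / (k - 1) ** 2


def weighted_kappa(test1, test2, k):
    """Cohen's kappa with quadratic weights; test1/test2 are category codes 0..k-1
    for the same respondents on the two applications of the survey."""
    w = quadratic_weights(k)
    observed = np.zeros((k, k))
    for a, b in zip(test1, test2):
        observed[a, b] += 1                     # cross-tabulate test 1 vs test 2
    observed /= observed.sum()
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))  # chance agreement
    p_obs = (w * observed).sum()
    p_exp = (w * expected).sum()
    return (p_obs - p_exp) / (1.0 - p_exp)


def bc_bootstrap_ci(test1, test2, k, n_boot=1000, alpha=0.05, seed=0):
    """Bias-corrected percentile bootstrap CI for the weighted kappa (1000 replicates)."""
    rng = np.random.default_rng(seed)
    test1, test2 = np.asarray(test1), np.asarray(test2)
    point = weighted_kappa(test1, test2, k)
    boots = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(test1), len(test1))   # resample respondents with replacement
        boots.append(weighted_kappa(test1[idx], test2[idx], k))
    boots = np.sort(boots)
    prop_below = np.clip(np.mean(boots < point), 1e-6, 1 - 1e-6)  # avoid infinite z0 at extremes
    z0 = norm.ppf(prop_below)                                     # bias-correction term
    lo = norm.cdf(2 * z0 + norm.ppf(alpha / 2))
    hi = norm.cdf(2 * z0 + norm.ppf(1 - alpha / 2))
    return point, float(np.quantile(boots, lo)), float(np.quantile(boots, hi))


# Hypothetical data for a four-category ordinal factor (e.g. factor 7), test 1 vs test 2.
t1 = [3, 3, 2, 1, 3, 0, 2, 3, 1, 3]
t2 = [3, 2, 2, 1, 3, 1, 3, 3, 0, 3]
kappa, ci_lo, ci_hi = bc_bootstrap_ci(t1, t2, k=4)
```

The kappa point estimate can be cross-checked against `sklearn.metrics.cohen_kappa_score(t1, t2, weights="quadratic")`, which applies the same quadratic weighting.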
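Footnote ^c states only that the ICCs come from a random effects model with participant and organisation random effects. The sketch below shows one way such a model could be fitted in Python; it is a minimal sketch, not the authors’ analysis, and it assumes participants are nested within organisations, a long-format DataFrame with `score`, `participant` and `organisation` columns, and the usual variance-ratio definitions of the two ICCs — none of which is confirmed by the paper.

```python
# Illustrative sketch, not the authors' analysis: variance-components ICCs from a
# linear mixed model with random intercepts for organisation and for participant
# (assumed nested within organisation), fitted with statsmodels.
import statsmodels.formula.api as smf


def variance_component_iccs(long_df):
    """long_df: one row per completed survey (participant x occasion), with columns
    'score', 'participant' and 'organisation' (column names are assumptions)."""
    model = smf.mixedlm(
        "score ~ 1",
        data=long_df,
        groups="organisation",                              # organisation random intercept
        re_formula="1",
        vc_formula={"participant": "0 + C(participant)"},   # participant intercepts nested in organisation
    )
    fit = model.fit(reml=True)
    var_org = float(fit.cov_re.iloc[0, 0])   # organisation-level variance
    var_part = float(fit.vcomp[0])           # participant-level variance
    var_resid = float(fit.scale)             # residual (occasion-to-occasion) variance
    total = var_org + var_part + var_resid
    # Assumed conventions: test-retest ICC = agreement between two occasions on the
    # same participant; organisation ICC = agreement between participants from the
    # same organisation.
    icc_test_retest = (var_org + var_part) / total
    icc_organisation = var_org / total
    return icc_test_retest, icc_organisation
```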