0821

Inter-rater Reliability and Translational Implications of MR-based Polycystic Kidney Volume Measurements by Stereology at Early and Late Stage Disease
Rebecca J. Lepping1, Rainer T. Karcher1, Paul Keselman1, Darren P. Wallace2, Alan Yu2, Laura E. Martin1,3, and William M. Brooks1,4

1Hoglund Brain Imaging Center, University of Kansas Medical Center, Kansas City, KS, United States, 2Medicine-Nephrology, University of Kansas Medical Center, Kansas City, KS, United States, 3Preventive Medicine and Public Health, University of Kansas Medical Center, Kansas City, KS, United States, 4Neurology, University of Kansas Medical Center, Kansas City, KS, United States

Synopsis

Autosomal dominant polycystic kidney disease (ADPKD) is characterized by the presence of fluid-filled cysts that grow over time. Total kidney volume (TKV) is one of the main biomarkers of disease progression, and is estimated through the technique of stereology. We tested whether patients’ kidney size impacted inter-rater reliability using the same stereology protocol. Stereology yielded excellent inter-rater reliability at both early and late stage disease. However, some pathology would still benefit from expert guidance in identifying kidney tissue. This technique can easily be translated to animal models of ADPKD.

Introduction

Autosomal dominant polycystic kidney disease (ADPKD) is characterized by the presence of fluid-filled cysts [1,2]. Micro- and macroscopic cysts contribute directly to kidney damage by exerting pressure on healthy parenchyma, and may become infected or malignant, requiring surgical removal. Previous studies in ADPKD have shown that total kidney volume (TKV; reflecting larger macroscopic cysts) is a good predictor of the risk of developing kidney insufficiency (Figure 1).

Standard kidney T1- and T2-weighted imaging for PKD provides excellent contrast for determining TKV by stereology and boundary tracing. Stereological approaches that involve overlaying a grid onto the image and counting intersection points are the current gold standard (CRISP [3]). Volume is estimated as the product of area per grid space, number of points selected, and slice thickness.

This project had two goals. First, since the number of points selected in the grid increases as kidney size increases, we hypothesized that kidney size (i.e., disease stage) may impact accuracy and inter-rater reliability given a common grid size. Second, we hypothesized that rater experience might influence the results. We conducted two inter-rater reliability tests with observers of varying expertise, one on small, early stage kidneys and one on large, late stage kidneys, to determine whether relative kidney size and experience impacted inter-rater reliability.

Methods

Rating: Stereology was performed on coronal T2-weighted MR images. A 100 mm² square grid was overlaid on the images, and the intersection points over kidney tissue were marked on each slice (Figure 2) (ImageJ 1.48v; Rasband, NIH). Total volume was estimated as (number of points × 100 × slice thickness) / 1000.
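The volume formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the per-slice point counts, and the 3 mm slice thickness are hypothetical (the abstract does not report slice thickness), and the sketch assumes thickness is given in mm, so that dividing mm³ by 1000 yields mL.

```python
def stereology_volume_ml(points_per_slice, slice_thickness_mm, grid_area_mm2=100.0):
    """Estimate volume from grid-point counts marked on each slice.

    points_per_slice   -- number of grid intersections over kidney tissue, per slice
    slice_thickness_mm -- slice thickness in mm (assumed unit; not given in the abstract)
    grid_area_mm2      -- area represented by one grid point (100 mm^2 in the abstract)
    """
    total_points = sum(points_per_slice)
    volume_mm3 = total_points * grid_area_mm2 * slice_thickness_mm
    return volume_mm3 / 1000.0  # 1000 mm^3 = 1 mL


# Hypothetical counts for one kidney across five slices (not study data).
example_counts = [12, 30, 41, 38, 17]
example_volume = stereology_volume_ml(example_counts, slice_thickness_mm=3.0)
print(f"{example_volume:.1f} mL")  # 138 points * 100 * 3 / 1000 = 41.4 mL
```

Because each marked point stands for a fixed volume element (grid area times slice thickness), a larger kidney simply contributes more points at the same grid size, which is the property the first hypothesis examines.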

Inter-rater reliability: We performed two inter-rater reliability analyses. In the first, five early stage PKD patients were rated by two observers - one with substantial experience in MR, but not expert in PKD pathology (RL), and one relatively inexperienced in both MR and PKD (RK) - each blinded to the other rater’s analyses.

In the second analysis, eighteen late stage PKD patients were included. The two raters from the first analysis were joined by an experienced nephrologist/PKD expert (AY), also blinded to previous analyses.

Inter-rater reliability of TKV measurements across subjects was calculated using a two-way mixed, absolute-agreement, average-measures intraclass correlation coefficient (ICC) (SPSS 22; IBM, Armonk, NY) [4]. Inter-rater reliability was deemed ‘Poor’ if less than .40, ‘Fair’ if between .40 and .59, ‘Good’ if between .60 and .74, and ‘Excellent’ if greater than .75 [5].
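For readers without SPSS, the ICC described above can be sketched in plain Python. The two-way, absolute-agreement, average-measures form corresponds to ICC(A,k) in the notation of McGraw and Wong (reference 6), computed from the mean squares of a subjects-by-raters table. This is a minimal sketch rather than the authors' analysis: the function names and the example ratings are hypothetical, and the continuous cutoffs in the labeling helper are an assumption, since the published bands are stated to two decimals.

```python
def icc_a_k(ratings):
    """ICC(A,k): two-way, absolute agreement, average measures.

    ratings -- list of rows, one row per subject, one column per rater.
    """
    n = len(ratings)        # number of subjects (kidneys)
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


def reliability_label(icc):
    """Map an ICC to the bands used in the text (continuous cutoffs assumed)."""
    if icc < 0.40:
        return "Poor"
    if icc < 0.60:
        return "Fair"
    if icc < 0.75:
        return "Good"
    return "Excellent"


# Hypothetical TKV estimates in mL for four kidneys and two raters (not study data).
example = [[410.0, 405.0], [650.0, 662.0], [980.0, 975.0], [1210.0, 1230.0]]
value = icc_a_k(example)
print(round(value, 3), reliability_label(value))
```

Because the absolute-agreement form keeps the rater (column) variance in the denominator, a rater who is systematically lower than the others reduces the coefficient, unlike a consistency-type ICC.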

Results

Early stage: The difference in TKV was <1% between the two observers (1-2, M=.02%, SD=.06%, t(9)=-0.69, p=.51). The intraclass correlation coefficient was .988, indicating excellent inter-rater agreement (Table 1).

Late stage: The difference in TKV was <1% between Raters 1 and 2 (1-2, M=-0.13%, SD=7.12%, t(35)=0.42, p=.68), but both significantly underestimated TKV compared to the PKD expert, Rater 3 (3-1, M=8.83%, SD=3.78%, t(35)=8.21, p<.001; 3-2, M=9.03%, SD=6.38%, t(35)=5.66, p<.001). Nevertheless, the intraclass correlation coefficient across the three raters was .988, indicating excellent inter-rater agreement (Table 1).
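The pairwise comparisons above report a mean percent difference, its standard deviation, and a t statistic. A sketch of one way such values could be computed follows. It rests on assumptions the abstract does not confirm: that the percent difference is taken relative to the mean of the two ratings, and that the t statistic tests the mean percent difference against zero. The function names and example values are hypothetical. The reported degrees of freedom (9 and 35) are consistent with each kidney of the 5 and 18 patients being treated as a separate observation, although the abstract does not state this.

```python
import math
import statistics


def percent_differences(rater_a, rater_b):
    """Percent difference (a - b) relative to the mean of the pair (assumed definition)."""
    return [100.0 * (a - b) / ((a + b) / 2.0) for a, b in zip(rater_a, rater_b)]


def one_sample_t(values):
    """Return mean, sample SD, t statistic against zero, and degrees of freedom."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample SD (n - 1 in the denominator)
    t = mean / (sd / math.sqrt(n))
    return mean, sd, t, n - 1


# Hypothetical TKV estimates in mL for three kidneys (not study data).
expert = [102.0, 104.0, 106.0]
novice = [98.0, 96.0, 94.0]
diffs = percent_differences(expert, novice)
mean, sd, t, df = one_sample_t(diffs)
print(f"M={mean:.2f}%, SD={sd:.2f}%, t({df})={t:.2f}")
```

A p-value would additionally require the t distribution (for example from a statistics package), which is left out here to keep the sketch dependency-free.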

Discussion

Both the early and late stage datasets yielded excellent inter-rater reliability for stereology [5,6]. In the late stage dataset, Raters 1 and 2 produced lower TKV estimates than the PKD expert. Visual inspection of the images with the largest differences revealed that Raters 1 and 2 had excluded specific cysts. In addition to a wider range of TKV in the late stage, we also observed the data to be slightly positively skewed, indicating that some volumes were substantially greater than the mean. This was not observed for the early stage data.

Conclusion

We conclude that a grid size of 100 mm² is sufficient to yield high inter-rater reliability of the stereology method, regardless of kidney size. Additionally, the raters in our study had widely ranging levels of expertise in PKD pathology. The inter-rater reliability measures in our study were taken from independent ratings by each observer; there were no discussions of the individual datasets before the ratings were made. Such a high ICC between raters, even with this conservative approach, suggests that the method is robust and does not require expertise in PKD or radiology. However, the significantly lower estimates from the non-expert raters point to the need for expert guidance in cases where pathology makes tissue identification difficult.

Finally, the reliability of this method has implications for translational studies. Using 9.4T T2-weighted MRI, we showed that MR could be used to quantify TKV in a murine model of ADPKD, and that TKV could be accurately monitored using MR [7]. Both macro- and microscopic cysts are visible on the MR images, and volumes are easily determined using methods similar to those developed for humans (Figure 3). Studies are currently underway to track treatment response in both mice and humans.

Acknowledgements

The Hoglund Brain Imaging Center, the Kansas PKD Research and Translation Core Center, and the authors are funded by the National Institutes of Health (P30 DK106912, R21 DK104086, R01 DK081579, S10 RR29577, UL1 TR000001), and by generous donors. The content is solely the authors' responsibility.

References

1. Chapman AB, Guay-Woodford LM, Grantham JJ, et al. Renal structure in early autosomal-dominant polycystic kidney disease (ADPKD): The Consortium for Radiologic Imaging Studies of Polycystic Kidney Disease (CRISP) cohort. Kidney International. 2003;64(3):1035-1045.

2. Grantham JJ, Mulamalla S, Grantham CJ, et al. Detected renal cysts are tips of the iceberg in adults with ADPKD. CJASN. 2012;7(7):1087-1093.

3. Chapman AB, Wei W. Imaging approaches to patients with polycystic kidney disease. Seminars in Nephrology. 2011;31(3):237-244.

4. Hallgren KA. Computing inter-rater reliability for observational data: an overview and tutorial. Tutor Quant Methods Psychol. 2012;8(1):23-34.

5. Cicchetti D. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol Assess. 1994;6(4):284-290.

6. McGraw KO, Wong SP. Forming inferences about some intraclass correlation coefficients. Psychological Methods. 1996;1(1):30-46.

7. Wallace DP, Hou YP, Huang ZL, et al. Tracking kidney volume in mice with polycystic kidney disease by magnetic resonance imaging. Kidney International. 2008;73(6):778-781.

Figures

Figure 1. Coronal T2-weighted MR images of an early stage PKD patient (left) and a late stage PKD patient (right).

Figure 2. Stereology grid overlaid on a T2-weighted image with points selected.

Table 1. Volume estimates from each rater.

Figure 3. a) Histology and b) T2 MRI of mouse cystic kidney at 9.4T [7].

Proc. Intl. Soc. Mag. Reson. Med. 25 (2017)