E-ISSN 2218-6050 | ISSN 2226-4485
 

Research Article


Open Veterinary Journal, (2025), Vol. 15(12): 6721-6737


10.5455/OVJ.2025.v15.i12.52


Interobserver agreement of retrospective renal pelvis measurements in cats with confirmed ureteral obstruction

Mason White*, Andrew Adezio, Katharina Flatz, Karon Hoffmann, Alexandra Kennedy, Richard Lam, Michelle Lau, Jia Wen Siow and Joanna White

Small Animal Specialist Hospital, North Ryde, Australia

*Corresponding Author: Mason White. Small Animal Specialist Hospital, North Ryde, Australia. Email: mwhite [at] sashvets.com

Submitted: 01/10/2025 Revised: 07/11/2025 Accepted: 13/11/2025 Published: 31/12/2025


Abstract

Background: Sonographic measurement of the renal pelvis is often used in the diagnosis and management of ureteral obstruction. However, much of the veterinary literature fails to detail the measurement techniques utilized.

Aim: This study of renal pelvis assessment methods aimed to compare the interobserver variability of subjective assessments with that of standardized, objective measurements.

Methods: Sonographic images of the renal pelvis from 15 cats with fluoroscopically confirmed ureteral obstruction were reviewed by six independent observers using 11 methods. Subjective (3) and objective (8) assessments of renal pelvic size in the dorsal, sagittal and transverse planes were created (45 images). The intraclass correlation coefficient (ICC) was calculated to assess interobserver agreement for each measurement method. A two-way random effect model and a single rater (quantitative) and Fleiss (categorical) kappa were calculated.

Results: Subjective assessment methods had the lowest interobserver agreement in every plane (transverse: κ=0.326, sagittal: κ=0.311, dorsal: κ=0.473). The standardized measurements with the highest interobserver agreement measured the distance between the renal crest and the ureter in dorsal (ICC=0.91) and transverse planes (ICC=0.907).

Conclusion: Subjective assessment of pyelectasia is variable, and standardized measurements should be used when sonographically assessing renal pelvic size. Consistent use of the method with the greatest interobserver agreement is likely to improve comparability between studies and ensure appropriate implementation of their findings in a clinical setting.

Keywords: Hydronephrosis, Measurement, Pelvis, Pyelectasia, Renal.


Introduction

Sonographic evaluation of the renal pelvis plays an important role in the diagnosis of many renal diseases, including obstructive uropathies. Mild distension (‟pyelectasia”) measuring up to 2–3 mm can occur with intravenous fluid administration (Jakovljevic et al., 1999; D’Anjou et al., 2011; Debruyn et al., 2012); however, measurements greater than this are considered pathological (D’Anjou et al., 2011; Quimby et al., 2017; Lemieux et al., 2021). Research examining the utility of renal pelvis measurements has delivered mixed results when attempting to differentiate obstructed from non-obstructed ureters (D’Anjou et al., 2011; Quimby et al., 2017; Fages et al., 2018; Cole et al., 2019; Lemieux et al., 2021). One study of azotaemic cats found no significant differences in renal pelvic size with or without obstructed ureters (Lamb et al., 2018). Another found that 26% of feline kidneys with pyelographically confirmed ureteral obstructions had renal pelvises <4 mm, with 8% even measuring <2 mm - below the threshold for pathological distension (Lemieux et al., 2021). Conversely, two studies found that a renal pelvic width ≥13 mm was only seen in animals with obstructive uropathies, and that transverse plane renal pelvic measurements were significantly larger in animals with ureteral obstruction than in animals with several other conditions (e.g., chronic kidney disease and conditions causing diuresis) (D’Anjou et al., 2011; Quimby et al., 2017). Neither study found significant differences in renal pelvis size when comparing animals with ureteral obstruction or pyelonephritis in the transverse plane, though one study did find significant differences when measuring in the sagittal plane (Quimby et al., 2017). This suggests that measurements obtained in different planes might have different diagnostic utilities and highlights the need to consider the plane of insonation and method of measurement when comparing different studies.

Objective sonographic measurements of the renal pelvis are commonly utilized in the pre-operative planning of surgical subcutaneous ureteral bypass (SUB) placement. The American College of Veterinary Internal Medicine Small Animal Consensus Recommendations on the Treatment and Prevention of Uroliths in Dogs and Cats recommend that ureterolith-induced ureteral obstructions causing renal pelvis dilation ≤3–5 mm be monitored instead of surgically decompressed (Lulich et al., 2016). Renal pelvic size also influences the surgical technique for SUB placement, with smaller pelvises (<5–10 mm) often necessitating fluoroscopic guidance (Livet et al., 2017) and/or advancement of the nephrostomy tube into the proximal ureter (Berent et al., 2012; Berent and Weisse, 2018; Berent et al., 2018; Berent and Weisse, 2020). The method by which these measurements are performed is not clearly defined in these sources. Renal pelvic size is also commonly assessed during post-operative monitoring of these patients. Cats with renal pelvises measuring <5 mm after surgery were substantially less likely to have postoperative obstructive complications (Fages et al., 2018). This threshold provides a potentially useful prompt to investigate for an obstructive complication, but could lead to inappropriate decision-making if the renal pelvis is measured using a different method from that utilized in that study.

Despite the commonality and potential utility of these measurements, there is no formally standardized way of measuring the renal pelvis in the veterinary literature. The transverse plane is commonly utilized; however, many studies provide little description of the exact measurement technique, and only a minority explain calliper placement either textually (Nevins et al., 2015) or pictorially (D’Anjou et al., 2011; Gould et al., 2016; Fages et al., 2018). Some studies have utilized measurements or grading systems using different angles of insonation, with or without instructions on calliper placement (Carpenter et al., 2012; Taylor et al., 2014; Quimby et al., 2017). As much of the literature does not specify the precise method of renal pelvis measurement, there is no way of definitively knowing whether the results of these previous studies are comparable. This lack of standardization might contribute to the disparity in the literature, as well as lead clinicians to draw inappropriate conclusions from sonographic measurements performed in practice. A standardized measurement of renal pelvis distension would allow for more precise replication in a clinical setting and help to improve the comparability of the results of future research.

This prospective study aimed to compare methods of measuring the renal pelvis on static sonographic images and determine which resulted in the lowest interobserver variability. Our hypothesis was that objective measurements would have a lower interobserver variability than subjective assessments.


Materials and Methods

Case selection and inclusion

Medical records and sonographic studies of feline patients with fluoroscopically confirmed ureteral obstruction in at least one kidney were obtained from a cohort of animals recruited in a previous study of risk factors for ureteral obstruction (Kennedy and White, 2022). All records were reviewed, and initial decisions on case inclusion were made by a resident in veterinary diagnostic imaging (MW). Data extracted from these medical records included date of admission to hospital, date and time of each ultrasound examination, and date and side of the SUB placement (left, right, or bilateral).

Images from the final ultrasound examination prior to SUB placement were reviewed. For all images of the kidneys, the side of the kidney and plane of insonation (dorsal, sagittal, transverse) were recorded. To standardize all images, the planes of insonation were defined by the following criteria, regardless of the included label:

- Dorsal plane: a long-axis image producing a classic ‟bean-shaped” profile that is approximately symmetrical in one plane.

- Sagittal plane: a long-axis image producing an approximately ‟dumbbell-shaped” or ‟hourglass-shaped” profile that is approximately symmetrical in two planes.

- Transverse plane: a short-axis image producing a rounded, near-circular profile that is approximately symmetrical in one plane.

For cine-loops, the most representative frame was selected for consideration as an individual, still image.

Image selection

Images were excluded if:

- They included measurements or callipers in locations that might bias retrospective measurement, including anywhere within the renal pelvis.

- They were obtained at an oblique angle or uncertain angle of insonation, such that they did not clearly fit into any of the above-defined categories.

- The renal pelvis was not clearly visible.

From each kidney in every cat, the single best image for diagnostic interpretation in each plane was selected. This was based on minimizing obliquity of the angle of insonation and maximizing visibility of the renal pelvis. All other images were excluded.

Measurement selection

Renal pelvis measurement methods were selected from veterinary and human literature. Each technique was named and defined according to the plane of insonation and the location of calliper placement. Eleven evaluation methods were assessed in total.

Table 1. Methods selected for measuring the renal pelvises on still sonographic images of the kidneys, acquired from 15 cats with fluoroscopically confirmed ureteral obstruction.

Three measurements were defined in each of the transverse (TCU, TDV, TPW) and sagittal (SPW, SPL, SPP) planes, and two were defined in the dorsal plane (DCU, DPW). Subjective (unmeasured) severity gradings of pyelectasia (none or normal, mild, moderate, marked) were included from all three planes (TSA, SSA, DSA) (Table 1).

Image categorization and anonymization

Images were imported using commercially available DICOM viewing software (OsiriX® version 11.0.4, Pixmeo, Switzerland) and organized into a ‘transverse plane group,’ ‘sagittal plane group,’ or ‘dorsal plane group’ based on the plane of insonation as above.

Six observers were recruited to independently assess the renal pelvis: three European College of Veterinary Diagnostic Imaging-boarded veterinary radiologists (KF, KH, RL), one American College of Veterinary Radiology-boarded veterinary radiologist (AA), and two Australian and New Zealand College of Veterinary Scientists residency-trained clinicians in veterinary radiology (ML, JWS). Each image was duplicated and categorized such that each observer could perform each measurement once on each image, blinded to other observers’ measurements. All images utilizing a common assessment method (subjective or objective) were collectively termed ‟measurement groups” (Fig. 1).

Fig. 1. Flow chart demonstrating how 45 images were duplicated and allocated so that each observer could measure each image once using each measurement method. Sets of 15 Transverse Images, 20 Sagittal Images, and 10 Dorsal images were duplicated to create multiple ‟Measurement Groups”—4 Transverse Plane, 4 Sagittal Plane, and 3 Dorsal Plane. Each measurement group within a plane was identical, but was assigned a different method of assessment. Each of the six observers assessed every image in a ‟Measurement Group” with its assigned method. Note: Measurement groups are represented to best demonstrate how images were duplicated and assigned. This figure does not represent the order in which they were assessed.

Patient data was anonymized in OsiriX. Patient identification data that was saved onto the image file could not be removed; however, to avoid bias, observers were blinded to all images and measurements other than the one being assessed/performed at any given time. To avoid bias when making subjective assessments, these were made prior to the specified, objective measurements. The order of assessment was as follows: TSA, SSA, DSA, TCU, TDV, TPW, SPW, SPL, SPP, DCU, and DPW.

Measurement

Observers measured the renal pelvis in each image using each applicable method as described in a provided set of instructions (Supplementary 14).

Exclusion criteria

To ensure that only clinically relevant images of sufficient quality were included in the analysis, a structured exclusion process was applied.

First, observers were asked to ‟reject” any images that they considered inappropriate for measurement for any reason, but still to make the measurement to the best of their ability. Reasons for rejection might include perceived obliquity of the angle of insonation, insufficient visibility of the renal pelvis, inability to visualize landmarks required for calliper placement, or any other reason that a clinician might decline to measure an image in a clinical setting. Images rejected by ≥3 observers were excluded from statistical analysis.

Second, in cats that underwent only unilateral SUB placement, the contralateral kidney was excluded. Any kidney where additional pathology was identified intra-operatively was also excluded from statistical analysis. This ensured that all images came from kidneys with fluoroscopically confirmed ureteral obstruction.

Third, to ensure consistency and avoid including more than one kidney per patient in a measurement group, when a measurement group contained images from a patient’s left and right kidney, the kidney that had been more often excluded from other measurement groups was also excluded from this group. This ensured that the images reviewers more often favored were included. In the event of a tie, where exclusions were evenly split between the left and right kidney, the total number of observer rejections for each kidney’s image was tallied, and the kidney with the highest cumulative rejection count was excluded. If this still did not resolve the issue, the left kidney was excluded arbitrarily (Supplementary 5).
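The rejection threshold and the three-step tie-break described above can be expressed as a small decision function. The following is an illustrative sketch only, not the procedure as implemented by the authors; the record type and field names (`KidneyImage`, `exclusions_in_other_groups`, `total_rejections`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class KidneyImage:
    """Hypothetical record for one kidney's image within a measurement group."""
    side: str                        # "left" or "right"
    exclusions_in_other_groups: int  # times this kidney was excluded elsewhere
    total_rejections: int            # cumulative observer rejections for its image


def rejected_by_observers(rejection_votes: int, threshold: int = 3) -> bool:
    """An image is excluded when at least `threshold` observers rejected it."""
    return rejection_votes >= threshold


def kidney_to_exclude(left: KidneyImage, right: KidneyImage) -> str:
    """Return the side to exclude when both kidneys appear in one group."""
    # Step 1: exclude the kidney more often excluded from other groups.
    if left.exclusions_in_other_groups != right.exclusions_in_other_groups:
        worse = max((left, right), key=lambda k: k.exclusions_in_other_groups)
        return worse.side
    # Step 2: on a tie, exclude the kidney with more cumulative rejections.
    if left.total_rejections != right.total_rejections:
        worse = max((left, right), key=lambda k: k.total_rejections)
        return worse.side
    # Step 3: if still tied, the left kidney is excluded arbitrarily.
    return left.side
```

For example, two kidneys with equal exclusion counts and equal rejection counts would resolve to excluding the left kidney, matching the final arbitrary rule.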

Statistics

Statistical analyses were selected and performed by a PhD-qualified, ACVIM-certified Internal Medicine Specialist (JW). The intraclass correlation coefficient (ICC) was calculated to assess the agreement between six radiologists’ measurements of pelvic dimensions in each of the measurement groups. A two-way random-effects model with a “single rater” unit was used for quantitative measurements, and Fleiss’ kappa was calculated for categorical assessments. For each calculated ICC or kappa value, agreement was interpreted as poor (<0.5), moderate (0.5–0.75), good (0.75–0.9), or excellent (>0.9).
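The agreement statistics named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code (the software used is not stated here). It assumes the absolute-agreement form of the two-way random-effects, single-rater ICC, often written ICC(2,1); the text does not say whether agreement or consistency was used. The function names and the handling of values falling exactly on a category boundary (0.5, 0.75, 0.9) are also assumptions.

```python
def icc_two_way_random_single(ratings):
    """ICC(2,1), absolute agreement (assumed form).

    ratings: list of rows, one row per image, one column per observer.
    """
    n = len(ratings)      # number of images
    k = len(ratings[0])   # number of observers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares: images (rows), observers (columns), error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def fleiss_kappa(counts):
    """Fleiss' kappa for categorical gradings.

    counts[i][j]: number of observers who placed image i in category j.
    Every image must be graded by the same number of observers.
    Undefined (division by zero) if all gradings fall in a single category.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Observed agreement per image, then averaged over images.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_items
    # Chance agreement from the overall category proportions.
    p_j = [sum(row[j] for row in counts) / (n_items * n_raters)
           for j in range(n_cats)]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


def interpret(value):
    """Map an ICC or kappa value to the categories used in this study.

    Boundary handling is an assumption: 0.5 -> moderate, 0.75 -> good,
    0.9 -> good, and only values above 0.9 -> excellent.
    """
    if value < 0.5:
        return "poor"
    if value < 0.75:
        return "moderate"
    if value <= 0.9:
        return "good"
    return "excellent"
```

Applied to values reported in this study, `interpret(0.907)` gives "excellent" and `interpret(0.311)` gives "poor".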

Ethical approval

Not needed for this study.


Results

Image selection

Of the 120 renal images assessed for initial inclusion, images were excluded due to on-image measurements (42), oblique/uncertain plane of insonation (15), poor visibility of the renal pelvis (6), and the presence of a superior image of the same kidney in the same plane (12). Forty-five images from 16 different patients were thus selected for measurement: 15 transverse plane, 20 sagittal plane, and 10 dorsal plane images.

Four transverse plane, four sagittal plane, and three dorsal plane assessments (subjective and objective) were made on 15 transverse plane, 20 sagittal plane, and 10 dorsal plane images, respectively. After creating separate images so that each of the six observers could assess each image with each method, a total of 1,020 images (360 Transverse, 480 Sagittal, and 180 Dorsal), labelled and assigned to one of six observers, existed divided between 11 ‘measurement groups,’ to which a specific measurement or assessment technique was assigned (Fig. 1). Each observer was therefore assigned to assess a total of 170 images.
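The image counts reported above follow from simple arithmetic, which can be checked as follows. This is a minimal sketch using only the figures stated in the text; the variable names are illustrative.

```python
# Images screened and excluded during initial selection (figures from the text).
screened = 120
excluded = {"on-image measurements": 42, "oblique/uncertain plane": 15,
            "poor pelvis visibility": 6, "superior image available": 12}
selected = screened - sum(excluded.values())

# Selected images and assessment methods per plane.
images_per_plane = {"transverse": 15, "sagittal": 20, "dorsal": 10}
methods_per_plane = {"transverse": 4, "sagittal": 4, "dorsal": 3}
n_observers = 6

# Each image is duplicated once per method per observer.
copies_per_plane = {
    plane: images_per_plane[plane] * methods_per_plane[plane] * n_observers
    for plane in images_per_plane
}
total_copies = sum(copies_per_plane.values())
copies_per_observer = total_copies // n_observers

print(selected)             # 45
print(copies_per_plane)     # {'transverse': 360, 'sagittal': 480, 'dorsal': 180}
print(total_copies)         # 1020
print(copies_per_observer)  # 170
```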

Exclusions

Across all 11 measurement groups, a total of 22 images were excluded by majority vote (≥3 observers). Thirty-five kidney images were excluded because ureteral obstruction was not confirmed fluoroscopically: 24 because the SUB was placed in the cat’s other kidney (confirming only unilateral obstruction), and 11 (one from each measurement group) because a concurrent ureteral tear was noted in one cat, confounding the diagnosis.

Thirty images were excluded to prevent multiple images from one patient from appearing in a measurement group (Table 2). All images from one cat were excluded by the above criteria, thus wholly excluding the cat from this study and bringing the total number of included cats to 15.

Table 2. Number of sonographic images excluded from each measurement group by each sequentially-applied exclusion criterion (left to right).

Results

The interobserver agreement was excellent for two (TCU, DCU), good for two (TPW, DPW), moderate for three (TDV, SPW, SPP), and poor for four assessment methods (TSA, SSA, DSA, SPL) (Table 3). The methods with the highest interobserver agreement in each plane were TCU (ICC=0.907) in the transverse plane, SPL (ICC=0.712) in the sagittal plane, and DCU (ICC=0.910) in the dorsal plane (Table 4). The methods with the best and worst interobserver agreement across all measurement groups were DCU (ICC=0.910) and SSA (κ=0.311), respectively (Table 3).

Table 3. ICC (quantitative measurements), kappa values (categorical measurements) and interpretation of interobserver agreement of renal pelvis measurement methods on still sonographic images. Data are separated into quantitative and categorical categories.

Table 4. Kappa values (categorical measurements) and ICC (quantitative measurements) of renal pelvis measurement methods on still sonographic images. Data are separated by measurement type and imaging plane.

Subjective assessments had ‟Poor” interobserver agreement, the lowest in every plane (TSA: κ=0.326, SSA: κ=0.311, DSA: κ=0.473), though dorsal plane assessments were the least variable of the three (Table 4).

Standardized measurements had ‟Poor” to ‟Excellent” agreement between observers but were best when measuring between the renal crest and the ureter in the dorsal (DCU) and transverse (TCU) plane; DCU: ICC=0.910, TCU: ICC=0.907 (Table 4).


Discussion

The greatest overall interobserver agreement was achieved by standardized measurements between the renal crest and proximal ureter in the dorsal (DCU) and transverse (TCU) planes (Figs. 2A and B), even though anatomical definitions of the ‟ureteropelvic junction” are lacking in the veterinary and human medical literature (Stringer and Yassaie, 2013). One review of literature examining normal human kidneys concluded that a discrete pyeloureteral junction might not be present, and that in most individuals there might be a gradual zone of transition between the pelvis and ureter. Similar studies are lacking in the veterinary literature. While the unipyramidal morphology of the feline kidney differs from that of the multipyramidal kidney in humans, the embryological development is comparable (Hyttel et al., 2010), and a similar ‟pyeloureteral region” might also be expected in these patients. The ambiguous definition of the pyeloureteral junction might have less impact on interobserver agreement in specific images, however, if the angle of insonation is not perfectly centered and parallel to the long-axis of the proximal ureter. In this scenario, the medial cortical parenchyma or adjacent hilar fat may appear as a discrete, faint echo at the medial aspect of the collecting system due to slice thickness artifact. These echoes could provide a ‟pseudo-margin” for calliper placement, which could increase interobserver agreement. We attempted to minimize the effects of these phenomena by providing clear instructions and allowing the observers to reject images that they considered inappropriate for interpretation in a clinical setting. Our results, therefore, likely reflect the standard that would be encountered during retrospective measurement of still images in clinical practice. Furthermore, if a standardized measurement can be performed reliably and found to correlate with or differentiate between clinical entities, it has value as a diagnostic tool. In these circumstances, whether the region measured represents the precise anatomical boundaries of the renal pelvis is not as important as the reliability with which the measurement can be repeated between observers.

Fig. 2. Examples of the DCU (A) and TCU (B) measurement methods, which had the highest interobserver agreement of all methods examined in this study.

Overall, measurements from the sagittal plane performed worse than those from the dorsal and transverse planes, possibly due to the variable appearance of the renal pelvis in this plane; its margins may undulate, and it might be bisected by the renal crest depending on the level of insonation (Fig. 3). A study of murine kidneys proposed a hydronephrosis grading method that might mitigate the effects of this variation by calculating the percentage of renal height comprised of renal parenchyma (as opposed to renal pelvis) in the sagittal plane (Carpenter et al., 2012). Disappointingly, the interobserver agreement of this method was found to be only ‟Moderate” when applied to cats in our study (Fig. 4).

Fig. 3. Sagittal plane images of multiple kidneys, demonstrating the variable appearance of the renal pelvis. (A, C) The renal pelvis is not bisected by the renal crest. (B, E) The renal pelvis is completely bisected by the renal crest, which varies in thickness and margination between the images. (D) The renal crest incompletely bisects the renal pelvis, appearing as an amorphous echogenic structure in the center of the renal pelvis. Note the undulating margins of the distended renal pelvises in (C) and (D).

Fig. 4. Examples of the “SPP” measurement, adapted from methods proposed in murine kidneys by Carpenter et al. (2012). The longitudinal renal length is measured as a guideline (i). Perpendicular to this, the transverse renal width (ii), renal pelvis diameter (iii), and renal papilla width (iv) are measured, and the percentage of the renal height that is comprised of renal parenchyma (as opposed to pelvis) is calculated. Note that the renal papilla is not visible in (b), which would result in a lower overall percentage of renal parenchyma contributing to renal height in this image.
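The percentage described in the caption above could be computed along the following lines. This is a hedged sketch only: the exact formula is not stated here, so the code assumes that the fluid-filled height is the pelvis diameter (iii) minus the papilla width (iv), and that the remainder of the renal width (ii) is parenchyma. The function and parameter names are hypothetical.

```python
def parenchyma_percentage(renal_width_mm, pelvis_diameter_mm, papilla_width_mm=0.0):
    """Percentage of renal height made up of parenchyma (assumed formula).

    renal_width_mm:     measurement (ii), perpendicular to the renal length
    pelvis_diameter_mm: measurement (iii)
    papilla_width_mm:   measurement (iv); 0.0 when the papilla is not visible
    """
    if renal_width_mm <= 0:
        raise ValueError("renal width must be positive")
    # Assumption: the papilla lies within the pelvis and counts as parenchyma.
    fluid_height = pelvis_diameter_mm - papilla_width_mm
    return 100.0 * (renal_width_mm - fluid_height) / renal_width_mm
```

Under this assumption, an image in which the papilla is not visible yields a lower percentage than the same measurements with a visible papilla, which is consistent with the note in the caption.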

The standardized measurements with ‟excellent” interobserver agreement were between the renal crest and pyeloureteral junction in the transverse (TCU) and dorsal (DCU) planes. The TCU measurement approximated that which appears most commonly in the veterinary literature and is the most often utilized method at the authors’ institution; it is possible that this increased familiarity and experience with performing this measurement contributed to the high degree of interobserver agreement. Our finding that the TCU measurement produced excellent interobserver agreement validates comparisons between these previous studies and justifies its continued use for the sake of maintaining comparability between studies and clinical examinations. The similarity between the interobserver agreements for TCU and DCU measurements is also not unexpected; though performed in different planes, both methods direct observers to measure the distance between the same two anatomical locations.

Subjective assessments had ‟poor” interobserver agreement in every plane, but were best in the dorsal plane, possibly due to a greater ability to recognize distension of the diverticula (hydronephrosis). Subjective assessment in the sagittal plane yielded the lowest interobserver agreement of all studied assessment types, possibly due to the variable appearance and lesser familiarity with sagittal-plane renal pelvis assessment. Regardless, all subjective measurements demonstrated poor interobserver agreement, highlighting the need to provide a quantitative measurement when describing sonographically identified pyelectasia.

Overall, different measurement methods achieved different degrees of interobserver agreement. For accurate comparison between studies, clinicians utilizing ultrasound to measure the renal pelvis in either research or clinical settings should therefore document the measurement method used and make efforts to utilize the methods with the highest interobserver agreement, which, based on our findings, are the TCU and DCU measurements. It is fortunate that the measurement most often described in the veterinary literature (TCU) demonstrated excellent interobserver agreement in this study; however, this approach is not formally standardized, and measurement methods are not explicitly described in many studies. The situation is similar in human medicine; a number of different, standardized grading schemes have been proposed and implemented with limited consensus between disciplines (Onen, 2020). Hydronephrosis is often inconsistently graded in the field of pediatric urology/nephrology, usually for the assessment of pyeloureteral outflow tract obstruction (Riccabona et al., 2008; Nguyen et al., 2014; Suson and Preece, 2020). One literature review found that >20% of retrieved studies did not categorize hydronephrosis in any way (Suson and Preece, 2020).

The renal pelvis is a three-dimensional structure that is approximately saddle-like or funnel-like in shape; distension might not occur evenly between planes or similarly with different diseases. It is reasonable, therefore, to question whether single-dimensional measurements can appropriately and accurately reflect meaningful changes in the volume of such a complex shape. The most common way of quantitatively measuring the renal pelvis in people is by its Anterior-Posterior Diameter (APD), usually measured at the parenchymal edge (hilus) in the transverse plane (Onen, 2020; Suson and Preece, 2020). However, studies of the APD have shown this measurement to lack standardization between disciplines, with consensus in only 64% of clinicians and intraobserver and interobserver variabilities of 5.2 ± 3.5% and 9.3 ± 9.7%, respectively (Pereira et al., 2011). It is also prone to variation with factors such as hydration, bladder filling, patient positioning, respiration, and different renal pelvic conformations (Pereira et al., 2011; Onen, 2020). A key limitation of renal ultrasound in the investigation of obstructive uropathies is its reliance on these indirect parameters, which, as outlined previously, yield conflicting results regarding diagnostic utility. Positive-contrast pyelography is considered the most accurate test for the diagnosis of ureteral obstruction in small animals, yielding a sensitivity and specificity of 100% in one study (Adin et al., 2003). It is also utilized in human medicine, along with multi-detector CT urography, the latter providing a precise definition of the nature, size, and location of the obstruction, an indication of the kidney’s function, and an opportunity to evaluate nearby structures (Khandelwal et al., 2023).

3D ultrasound is becoming increasingly utilized as a non-invasive way of overcoming the limitations of 2-dimensional sonography. In 3D ultrasound, the beam is swept across the area of interest, accurately recording the echo information and relative anatomic position of each section (Downey et al., 2000). The area can then be assessed tomographically, viewed as a 3D-rendered image, and manipulated with various post-processing techniques to allow comprehensive visualization of the collecting system (Downey et al., 2000; Riccabona et al., 2003; Elwagdy et al., 2008). This is a significant advantage over conventional 2D ultrasound, in which patient anatomy, positioning, or individual variation in sonographic technique might disallow visualization of an area of interest or confound comparisons between acquisitions (Downey et al., 2000). This technique has been found to improve the ability to detect, localize, and characterize obstructive uropathies associated with urinary calculi, inflammation or neoplasia (Elwagdy et al., 2008), and has a superior capability to accurately estimate the absolute volume of both the renal pelvis and kidney as a whole (Riccabona et al., 1996; Strømmen et al., 2004; Hoyong, 2016) with excellent interobserver and intraobserver agreement (Raine-Fennings et al., 2003; Kim et al., 2010; Duin et al., 2013; Yoshizaki et al., 2013; Esser et al., 2025). Combined with endoluminal ultrasound, 3D imaging allowed superior assessment of pathology and revealed important anatomical variations in the surgical treatment of ureteropelvic obstruction, especially regional blood vessels (Lin et al., 2008; Zhu et al., 2021). This ultimately shortened procedure time (Zhu et al., 2021; Adanur et al., 2022), helped to minimize blood loss (Zhu et al., 2021), and, combined with conventional 2D-endoluminal ultrasound, helped to modify the interventional procedure in 18.6% of patients (Lin et al., 2008).

3D ultrasound, therefore, appears to be a promising diagnostic and surgical-planning tool in patients with obstructive uropathies, as it may overcome many of the limitations encountered when relying on one-dimensional or two-dimensional measurements. Nevertheless, its use remains rare in veterinary practice, and there is a paucity of literature evaluating its utility in feline urinary tract disease, warranting further investigation.

The clinical utility of renal pelvic measurements depends upon their repeatability as well as their correlation with or ability to differentiate between clinical diseases. Correlation with pathology was beyond the scope of our study, but has been the topic of much prior research.

To improve comparability between publications, the most reliable methods identified in our study should be included in future research on this topic. Additionally, an understanding of the repeatability of measurements used to assess these anatomical structures is beneficial when selecting surgical technique and implant size. Knowing the degree to which interobserver variability contributes to measurements of renal pelvic size is also likely to help clinical decision-making during post-operative patient monitoring, and the least variable method should be prioritized for this use.

This study had some limitations. Images considered best for interpretation were pre-selected based on the visibility and obliquity of the renal pelvis. Although this introduces potential selection bias, it avoided the need for observers to review all available images from every study, a process that would have been prohibitively time-consuming and would have required additional subjective criteria for image selection.

Another limitation is that only previously acquired, still images were assessed, though real-time imaging of the renal pelvis may provide the sonographer with an overall impression of the renal pelvis that cannot be captured in a single image. Similarly, the level of insonation is likely to account for a large degree of the variability in the subjective appearance and objective measurement of the renal pelvis. Future studies should focus on assessing the degree of variability that can be attributable to differences in image acquisition.

The fact that all observers work at the same institution is a limitation that could not be overcome for this study. It is expected that some degree of similarity will arise between clinicians working in the same institution, and this may have increased the similarity between subjective assessments as well as affected the way that measurement instructions were interpreted. Unfortunately, the degree to which the results were affected cannot be quantified.

Finally, this study required each observer to assess each image using each assessment method only once. Some variability identified in this study might result from small inconsistencies in the way each individual performed their assessments. Though assessing intraobserver variability was beyond the scope of this study, this could be the focus of further research.
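For readers wishing to quantify agreement of this kind, the following is a minimal sketch of a two-way random-effects, single-rater ICC, often written ICC(2,1). It assumes the absolute-agreement form and a complete table with one row per image and one column per observer. The function name icc_2_1 and the example values are hypothetical and are not data from this study; the same function could be applied to repeated assessments by one observer if intraobserver variability were of interest.

```python
from statistics import mean


def icc_2_1(ratings):
    """Two-way random-effects, absolute-agreement, single-rater ICC.

    ratings: list of rows, one row per image and one column per observer.
    (Hypothetical helper for illustration; not the authors' analysis code.)
    """
    n = len(ratings)                      # number of images
    k = len(ratings[0]) if n else 0       # number of observers
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 rows and >= 2 columns")

    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(col) for col in zip(*ratings)]

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                    # between-image mean square
    ms_cols = ss_cols / (k - 1)                    # between-observer mean square
    ms_error = ss_error / ((n - 1) * (k - 1))      # residual mean square

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_error) / denominator


if __name__ == "__main__":
    # Hypothetical measurements (mm): 4 images, each measured by 3 observers
    example = [
        [2.1, 2.4, 2.0],
        [5.6, 5.9, 5.2],
        [9.8, 10.4, 9.5],
        [3.3, 3.0, 3.6],
    ]
    print(f"ICC(2,1) = {icc_2_1(example):.3f}")
```

Because the absolute-agreement form penalizes systematic offsets between observers, an observer who consistently measures larger than the others lowers the coefficient even when the ranking of images is preserved.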


Conclusions

In conclusion, the renal pelvis is a complex, three-dimensional structure, and documenting the method of measurement is important to ensure comparability between examinations and studies. Subjective assessments of renal pelvis size had “Poor” interobserver agreement and should be avoided when describing pyelectasia. Measurements between the renal crest and pyeloureteral junction in the dorsal and transverse planes were the only methods to achieve “Excellent” interobserver agreement and should be favored when measuring the renal pelvis in cats with ureteral obstruction. However, there is conflicting literature on their utility when differentiating obstructed from non-obstructed ureters. Future research should focus on how image acquisition contributes to interobserver variability of these renal pelvis measurements, as well as how they relate to clinical disease.


Acknowledgments

The authors wish to thank Dr. Shelley Wo for helping to test the process by which examiners accessed instructions and made digital measurements.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Author contributions

Mason White

- Conceptualization: Formulation of research goals.

- Data Curation: Management of measurement data.

- Methodology: Development of study methodology, selection of images, and creation of image sets for assessment.

- Project Administration: Management of images and measurement data, and logistical coordination of measurement data acquisition.

- Supervision: Oversight of measurement data acquisition.

- Visualization: Preparation of figures, tables, and supplementary materials.

- Writing: Preparation of the original draft.

Andrew Adezio

- Investigation: Performing measurements for analysis and comparison.

Katharina Flatz

- Investigation: Performing measurements for analysis and comparison.

Karon Hoffmann

- Investigation: Performing measurements for analysis and comparison.

Alexandra Kennedy

- Resources: Provision of patient cohort for patient and image data acquisition.

Richard Lam

- Investigation: Performing measurements for analysis and comparison.

Michelle Lau

- Investigation: Performing measurements for analysis and comparison.

Jia Wen Siow

- Investigation: Performing measurements for analysis and comparison.

Joanna White

- Conceptualization: Formulation of research goals.

- Formal analysis: Statistical analysis of measurement data for interpretation.

- Methodology: Development of study methodology.

- Supervision: Oversight and mentorship of study design and data interpretation.

- Writing: Review of original drafts, including commentary and revision recommendations.

Conflicts of interest

The authors declare that there is no conflict of interest.

Data availability

The data that support the findings of this study are not openly available due to reasons of sensitivity and are available from the corresponding author upon reasonable request.


References

Adanur, S., Demirdogen, S.O., Altay, M.S. and Polat, O. 2022. Comparing the effects of 2D and 3D imaging systems on laparoscopic pyeloplasty outcomes in the treatment of adult ureteropelvic junction obstruction. J. Laparoendosc. Adv. Surg. Tech. 32, 1043–1047.

Adin, C.A., Herrgesell, E.J., Nyland, T.G., Hughes, J.M., Gregory, C.R., Kyles, A.E., Cowgill, L.D. and Ling, G.V. 2003. Antegrade pyelography for suspected ureteral obstruction in cats: 11 cases (1995–2001). J. Am. Vet. Med. Assoc. 222, 1576–1581.

Berent, A. and Weisse, C. 2018. SUB™ 2.0: a surgical guide. Available via https://norfolkvetproducts.com/PDF/SUB/SUB2_Surgical_Guide_2018-03-email.pdf

Berent, A. and Weisse, C. 2020. SUB™ 3.0: a surgical guide. Available via https://norfolkvetproducts.com/wp-content/uploads/2020/10/SUB3_Surgical_Guide_2020-09-email.pdf

Berent, A.C., Weisse, C.W., Bagley, D.H. and Lamb, K. 2018b. Use of a subcutaneous ureteral bypass device for treatment of benign ureteral obstruction in cats: 174 ureters in 134 cats (2009-2015). J. Am. Vet. Med. Assoc. 253, 1309–1327.

Berent, A.C., Weisse, C.W., Todd, K.L. and Bagley, D.H. 2012. Use of locking-loop pigtail nephrostomy catheters in dogs and cats: 20 cases (2004-2009). J. Am. Vet. Med. Assoc. 241, 348–357.

Carpenter, A.R., Becknell, B., Ingraham, S.E. and McHugh, K.M. 2012. Ultrasound imaging of the murine kidney. Methods Mol. Biol. 886, 403–410.

Cole, L.P., Mantis, P. and Humm, K. 2019. Ultrasonographic findings in cats with acute kidney injury: a retrospective study. J. Feline Med. Surg. 21, 475–480.

D'Anjou, M.A., Bédard, A. and Dunn, M.E. 2011. Clinical significance of renal pelvic dilatation on ultrasound in dogs and cats. Vet. Radiol. Ultrasound. 52, 88–94.

Debruyn, K., Haers, H., Combes, A., Paepe, D., Peremans, K., Vanderperren, K. and Saunders, J.H. 2012. Ultrasonography of the feline kidney: technique, anatomy and changes associated with disease. J. Feline Med. Surg. 14, 794–803.

Downey, D.B., Fenster, A. and Williams, J.C. 2000. Clinical utility of three-dimensional US. RadioGraphics 20, 559–571.

Duin, L.K., Willekes, C., Vossen, M., Offermans, J. and Nijhuis, J.G. 2013. Reproducibility of fetal renal pelvis volume assessed by three-dimensional ultrasonography with two different measurement techniques. J. Clin. Ultrasound 41, 230–234.

Elwagdy, S., Ghoneim, S., Moussa, S. and Ewis, I. 2008. Three-dimensional ultrasound (3D US) methods in the evaluation of calculous and non-calculous ureteric obstructive uropathy. World J. Urol. 26, 263–274.

Esser, M., Tsiflikas, I., Jago, J.R., Rouet, L., Stebner, A. and Schäfer, J.F. 2025. Semiautomatic three-dimensional ultrasound renal volume segmentation in pediatric hydronephrosis: interrater agreement and correlation to conventional hydronephrosis grading. Pediatr. Radiol. 55, 1298–1307.

Fages, J., Dunn, M., Specchi, S. and Pey, P. 2018. Ultrasound evaluation of the renal pelvis in cats with ureteral obstruction treated with a subcutaneous ureteral bypass: a retrospective study of 27 cases (2010-2015). J. Feline Med. Surg. 20, 875–883.

Gould, E.N., Cohen, T.A., Trivedi, S.R. and Kim, J.Y. 2016. Emphysematous pyelonephritis in a domestic shorthair cat. J. Feline Med. Surg. 18, 357–363.

Hoyong, J. 2016. Three-dimensional ultrasound volume assessment of pediatric kidneys: comparison with conventional two-dimensional ultrasound, M.S. thesis, Seoul National Univ., Seoul, South Korea.

Hyttel, P., Sinowatz, F. and Vejlsted, M. 2010. Essentials of domestic animal embryology. Edinburgh, United Kingdom: Elsevier.

Jakovljevic, S., Rivers, W.J., Chun, R., King, V.L. and Han, C.M. 1999. Results of renal ultrasonography performed before and during administration of saline (0.9% NaCl) solution to induce diuresis in dogs without evidence of renal disease. Am. J. Vet. Res. 60, 405–409.

Kennedy, A.J. and White, J.D. 2022. Feline ureteral obstruction: a case-control study of risk factors (2016-2019). J. Feline Med. Surg. 24, 298–303.

Khandelwal, S., Dhande, R., Sood, A., Parihar, P. and Mishra, G.V. 2023. Role of multidetector computed tomography urography in the evaluation of obstructive uropathy: a review. Cureus. 15, e48038.

Kim, H.C., Yang, D.M., Jin, W. and Lee, S.H. 2010. Relation Between Total Renal Volume and Renal Function: usefulness of 3D Sonographic Measurements With a Matrix Array Transducer. Am. J. Roentgenol. 194, W186–W192.

Lamb, C.R., Cortellini, S. and Halfacree, Z. 2018. Ultrasonography in the diagnosis and management of cats with ureteral obstruction. J. Feline Med. Surg. 20, 15–22.

Lemieux, C., Vachon, C., Beauchamp, G. and Dunn, M.E. 2021. Minimal renal pelvis dilation in cats diagnosed with benign ureteral obstruction by antegrade pyelography: a retrospective study of 82 cases (2012-2018). J. Feline Med. Surg. 23, 892–899.

Lin, L., Bagley, D.H. and Liu, J.B. 2008. Role of endoluminal sonography in evaluation of obstruction of the ureteropelvic junction. Am. J. Roentgenol. 191, 1250–1254.

Livet, V., Pillard, P., Goy-Thollot, I., Maleca, D., Cabon, Q., Remy, D., Fau, D., Viguier, E., Pouzot, C., Carozzo, C. and Cachon, T. 2017. Placement of subcutaneous ureteral bypasses without fluoroscopic guidance in cats with ureteral obstruction: 19 cases (2014-2016). J. Feline Med. Surg. 19, 1030–1039.

Lulich, J.P., Berent, A.C., Adams, L.G., Westropp, J.L., Bartges, J.W. and Osborne, C.A. 2016. ACVIM Small Animal Consensus Recommendations on the Treatment and Prevention of Uroliths in Dogs and Cats. J. Vet. Intern. Med. 30, 1564–1574.

Nevins, J.R., Mai, W. and Thomas, E. 2015. Associations between ultrasound and clinical findings in 87 cats with urethral obstruction. Vet. Radiol. Ultrasound 56, 439–447.

Nguyen, H.T., Benson, C.B., Bromley, B., Campbell, J.B., Chow, J., Coleman, B., Cooper, C., Crino, J., Darge, K., Anthony Herndon, C.D., Odibo, A.O., Somers, M.J.G. and Stein, D.R. 2014. Multidisciplinary consensus on the classification of prenatal and postnatal urinary tract dilation (UTD classification system). J. Pediatr. Urol. 10, 982–998.

Onen, A. 2020. Grading of hydronephrosis: an ongoing challenge. Front. Pediatr. 8, 458.

Pereira, A.K., Reis, Z.S.N., Bouzada, M.C.F., De Oliveira, E.A., Osanan, G. and Cabral, A.C.V. 2011. Antenatal ultrasonographic anteroposterior renal pelvis diameter measurement: is it a reliable way of defining fetal hydronephrosis? Obstet. Gynecol. Int. 2, 86.

Quimby, J.M., Dowers, K., Herndon, A.K. and Randall, E.K. 2017. Renal pelvic and ureteral ultrasonographic characteristics of cats with chronic kidney disease in comparison with normal cats, and cats with pyelonephritis or ureteral obstruction. J. Feline Med. Surg. 19, 784–790.

Raine-Fenning, N.J., Clewes, J.S., Kendall, N.R., Bunkheila, A.K., Campbell, B.K. and Johnson, I.R. 2003. The interobserver reliability and validity of volume calculation from three-dimensional ultrasound datasets in the in vitro setting. Ultrasound Obstet. Gynecol. 21, 283–291.

Riccabona, M., Avni, F.E., Blickman, J.G., Dacher, J.N., Darge, K., Lobo, M.L. and Willi, U. 2008. Imaging recommendations in paediatric uroradiology: minutes of the ESPR workgroup session on urinary tract infection, fetal hydronephrosis, urinary tract ultrasonography and voiding cystourethrography, Barcelona, Spain, June 2007. Pediatr. Radiol. 38, 138–145.

Riccabona, M., Fritz, G. and Ring, E. 2003. Potential applications of three-dimensional ultrasound in the pediatric urinary tract: pictorial demonstration based on preliminary results. Eur. Radiol. 13, 2680–2687.

Riccabona, M., Nelson, T.R. and Pretorius, D.H. 1996. Three-dimensional ultrasound: accuracy of distance and volume measurements. Ultrasound Obstet. Gynecol. 7, 429–434.

Stringer, M.D. and Yassaie, S. 2013. Is the pelviureteric junction an anatomical entity? J. Pediatr. Urol. 9, 123–128.

Strømmen, K., Stormark, T.A., Iversen, B.M. and Matre, K. 2004. Volume estimation of small phantoms and rat kidneys using three-dimensional ultrasonography and a position sensor. Ultrasound Med. Biol. 30, 1109–1117.

Suson, K.D. and Preece, J. 2020. Do current scientific reports of hydronephrosis make the grade? J. Pediatr. Urol. 16, 597.e1–597.e6.

Taylor, A.J., Lara-Garcia, A. and Benigni, L. 2014. Ultrasonographic characteristics of canine renal lymphoma. Vet. Radiol. Ultrasound. 55, 441–446.

Yoshizaki, C.T., Francisco, R.P.V., De Pinho, J.C., Ruano, R. and Zugaib, M. 2013. Renal volumes measured by 3-dimensional sonography in healthy fetuses from 20 to 40 weeks. J. Ultrasound Med. 32, 421–427.

Zhu, W., Xiong, S., Xu, C., Zhu, Z., Li, Z., Zhang, L., Guan, H., Huang, Y., Zhang, P., Zhu, H., Lin, J., Li, X. and Zhou, L. 2021. Initial experiences with preoperative three-dimensional image reconstruction technology in laparoscopic pyeloplasty for ureteropelvic junction obstruction. Transl. Androl. Urol. 10, 4142–4151.


Supplementary Material

Supplementary 1. The set of instructions provided to observers to dictate how they made assessments of the renal pelvis images.

Supplementary 2. Examples of the “Transverse crest-to-ureter” (TCU) (A), “Transverse pelvic width” (TPW) (B), and “Transverse dorsoventral” (TDV) (C) measurements performed by observers.

Supplementary 3. Examples of the “Sagittal pelvic width” (SPW) (A), “Sagittal pelvic length” (SPL) (B), and “Sagittal parenchymal percentage” (SPP) (C) measurements performed by observers.

Supplementary 4. Examples of the “Dorsal crest-to-ureter” (DCU) (A) and “Dorsal pelvic width” (DPW) (B) measurements performed by observers.

Supplementary 5. Flow chart outlining the method by which images were excluded after assessment.



How to Cite this Article
Pubmed Style

White M, Adezio A, Flatz K, Hoffmann K, Kennedy A, Lam R, Lau M, Siow JW, White J. Interobserver agreement of retrospective renal pelvis measurements in cats with confirmed ureteral obstruction. Open Vet. J. 2025; 15(12): 6721-6737. doi:10.5455/OVJ.2025.v15.i12.52

