Inter-examiner reliability of the Doha agreement meeting classification system of groin pain in male athletes
The upshot
How reliably do two experienced clinicians agree when classifying groin pain in athletes using the Doha agreement meeting classification system?
Inter-examiner reliability of the Doha classification system ranges from slight to substantial depending on the clinical entity, with perfect agreement only when athletes present with a single, unilateral groin pain entity. Reliability is lower for complex cases involving bilateral symptoms or multiple entities.
Descriptive
Primary study, 48 participants, limited evidence
Key points
- Kappa values ranged from slight (k=0.12 for pubic-related groin pain, k=0.13 for 'other causes') to substantial (k=0.62 for hip-related groin pain) across the six clinical entities
- Perfect 100% agreement was achieved for the 7 athletes with unilateral, single-entity groin pain
- Ranking entities by perceived clinical importance improved reliability for adductor-, inguinal-, and iliopsoas-related pain but not for pubic-related or 'other causes'
- Subgroup analysis using strict diagnostic criteria showed higher kappa values for adductor-, inguinal-, iliopsoas-, and pubic-related groin pain (0.23 to 0.73, fair to substantial) than the primary analysis of the same entities (0.12 to 0.57)
- Both examiners were Doha agreement panel experts, so findings may overestimate reliability achievable by less experienced clinicians
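The descriptive labels used in this summary (slight, fair, moderate, substantial) are consistent with the widely used Landis and Koch bands for kappa. The sketch below is a minimal illustration, assuming the study applied those cut-offs (the summary does not name the scale); `landis_koch_label` is a hypothetical helper name, and the kappa values are the ones reported under "What they found".

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to a Landis and Koch descriptive band.

    Assumption: the summary's labels follow these cut-offs.
    """
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Dichotomous kappa values as reported in this summary.
reported_kappa = {
    "adductor-related": 0.40,
    "inguinal-related": 0.44,
    "iliopsoas-related": 0.57,
    "pubic-related": 0.12,
    "hip-related": 0.62,
    "other causes": 0.13,
}

for entity, kappa in reported_kappa.items():
    print(f"{entity}: kappa {kappa:.2f} -> {landis_koch_label(kappa)}")
```

Note that 0.40 sits on the upper edge of the "fair" band, so a small change in the estimate would move adductor-related groin pain into "moderate".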
How it was conducted
- Design: Prospective inter-examiner reliability study
- Participants: 48 male athletes (66 symptomatic sides) with longstanding groin pain, recruited at Aspetar Orthopaedic and Sports Medicine Hospital, Doha, Qatar
- Examiners: Two blinded clinicians, a general surgeon (23 years' experience) and a physiotherapist (10 years' experience), both Doha agreement panel members
- Classification tool: Doha agreement meeting classification system (adductor-, inguinal-, iliopsoas-, pubic-, and hip-related groin pain, and other causes)
- Primary outcome: Inter-examiner reliability measured by Cohen's kappa with 95% CI
- Analysis: Dichotomous (present/absent) and ordinal (ranked by clinical importance) kappa; subgroup analysis using strict Doha criteria
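To make the two analyses concrete, the sketch below computes Cohen's kappa for two examiners, unweighted for present/absent ratings and weighted for ordinal ranks. It is an illustration only: the ratings are made-up, `cohen_kappa` is a hypothetical helper, and linear weights are an assumption because the summary does not say which weighting scheme the study used.

```python
def cohen_kappa(ratings_a, ratings_b, categories, weighted=False):
    """Cohen's kappa for two raters over an ordered list of categories.

    weighted=False: any disagreement counts fully (dichotomous/nominal use).
    weighted=True: linear weights, so near-misses on an ordinal scale
    count less than distant disagreements (assumed scheme).
    """
    n = len(ratings_a)
    k = len(categories)
    if n == 0 or n != len(ratings_b):
        raise ValueError("both raters must rate the same non-empty set of cases")
    if k < 2:
        raise ValueError("at least two categories are required")

    index = {c: i for i, c in enumerate(categories)}

    # Joint proportions: observed[i][j] = share of cases rated i by A and j by B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1 / n

    row = [sum(observed[i]) for i in range(k)]                       # rater A marginals
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]  # rater B marginals

    def disagreement(i, j):
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    obs = sum(disagreement(i, j) * observed[i][j] for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if exp == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1 - obs / exp


# Hypothetical present (1) / absent (0) ratings for one entity on ten sides.
examiner_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
examiner_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
print(round(cohen_kappa(examiner_a, examiner_b, [0, 1]), 2))

# Hypothetical ordinal ranks (0 = not present, 1 = secondary, 2 = primary).
rank_a = [0, 1, 2, 0, 1, 2]
rank_b = [0, 1, 2, 1, 2, 2]
print(cohen_kappa(rank_a, rank_b, [0, 1, 2], weighted=False))
print(cohen_kappa(rank_a, rank_b, [0, 1, 2], weighted=True))
```

In the ordinal example the examiners never disagree by more than one rank, so the weighted kappa comes out higher than the unweighted one. This is the mechanism by which ranking can raise agreement when examiners identify the same entities but order them slightly differently.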
What they found
- Adductor-related groin pain: kappa 0.40 (95% CI 0.16-0.64), fair on dichotomous scale; weighted kappa ranged from 0.52 to 0.65 after ranking
- Inguinal-related groin pain: kappa 0.44 (95% CI not available in this summary), moderate on dichotomous scale
- Iliopsoas-related groin pain: kappa 0.57 (95% CI 0.37-0.77), moderate on dichotomous scale
- Pubic-related groin pain: kappa 0.12 (95% CI -0.23 to 0.46), slight
- Hip-related groin pain: kappa 0.62 (95% CI 0.30-0.95), substantial
- Other causes of groin pain: kappa 0.13 (95% CI -0.35 to 0.61), slight
- Overall exact agreement on same classification combination: 14/48 participants (29%) and 15/66 sides (23%)
- Unilateral single-entity cases: 100% agreement (7/7 participants)
- Subgroup analysis (strict criteria): kappa 0.48 for inguinal-, 0.49 for adductor-, 0.73 for iliopsoas-, 0.23 for pubic-related groin pain
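The confidence intervals above carry as much information as the point estimates. As a rough worked check, the sketch below backs out an approximate standard error from each reported interval and flags intervals that include zero. It assumes the intervals are symmetric normal-approximation (Wald) intervals, which the summary does not state; `implied_se` and `excludes_zero` are hypothetical helper names.

```python
Z_95 = 1.96  # normal quantile for a two-sided 95% interval

# (kappa, lower, upper) as reported in this summary; the inguinal-related
# interval is omitted because it is not given here.
reported = {
    "adductor-related": (0.40, 0.16, 0.64),
    "iliopsoas-related": (0.57, 0.37, 0.77),
    "pubic-related": (0.12, -0.23, 0.46),
    "hip-related": (0.62, 0.30, 0.95),
    "other causes": (0.13, -0.35, 0.61),
}


def implied_se(lower, upper, z=Z_95):
    """Approximate standard error implied by a symmetric interval (assumption)."""
    return (upper - lower) / (2 * z)


def excludes_zero(lower, upper):
    """True when the interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


for entity, (kappa, lower, upper) in reported.items():
    se = implied_se(lower, upper)
    note = "excludes 0" if excludes_zero(lower, upper) else "includes 0"
    print(f"{entity}: kappa {kappa:.2f}, implied SE {se:.2f}, CI {note}")
```

Under that assumption, the intervals for pubic-related groin pain and other causes include zero, so the observed agreement for those two entities cannot be distinguished from chance in this sample.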
Limitations
- Both examiners were experienced Doha agreement panel experts, limiting generalizability to clinicians with less groin-pain specialization
- Tertiary referral setting inflated complexity (bilateral pain, multiple entities), which may have reduced reliability compared to primary care
- Study population was exclusively male athletes, so findings cannot be assumed to apply to female athletes
- Small sample for low-prevalence entities (hip-related 11%, other causes 10%), increasing kappa uncertainty
Why it matters
- For patients: Athletes with groin pain should know that the label assigned to their condition can differ between clinicians, particularly if pain is present on both sides or involves more than one structure.
- For clinicians: Agreement was perfect for unilateral, single-entity presentations, but classifications of multi-entity or bilateral cases should be interpreted with caution; applying strict Doha criteria and ranking entities by clinical importance improved agreement for most entities.
- For readers: This is the first study to formally test how consistently the widely used Doha groin classification can be applied, revealing important gaps that need to be addressed before the system can support reliable clinical decision-making or research.
Source
doi:10.1111/sms.14248