PhysioHub

Inter-examiner reliability of the Doha agreement meeting classification system of groin pain in male athletes

The upshot

How reliably do two experienced clinicians agree when classifying groin pain in athletes using the Doha agreement meeting classification system?

Inter-examiner reliability of the Doha classification system ranges from slight to substantial depending on the clinical entity, with perfect agreement only when athletes present with a single, unilateral groin pain entity. Reliability is lower for complex cases involving bilateral symptoms or multiple entities.

Descriptive
Primary study | 48 participants | Limited evidence

Key points

  1. Kappa values ranged from slight (k=0.12 for pubic-related groin pain) to substantial (k=0.62 for hip-related groin pain) across the six clinical entities
  2. Perfect (100%) agreement was achieved for the 7 athletes with unilateral, single-entity groin pain
  3. Ranking entities by perceived clinical importance improved reliability for adductor-, inguinal-, and iliopsoas-related pain but not for pubic-related or 'other causes'
  4. Subgroup analysis using strict diagnostic criteria showed higher kappa values (0.23 to 0.73, fair to substantial) than the primary analysis of the same four entities (0.12 to 0.57, slight to moderate)
  5. Both examiners were Doha agreement panel experts, so findings may overestimate reliability achievable by less experienced clinicians

How it was conducted

Design
Prospective inter-examiner reliability study
Participants
48 male athletes (66 symptomatic sides) with longstanding groin pain, recruited at Aspetar Orthopaedic and Sports Medicine Hospital, Doha, Qatar
Examiners
Two blinded clinicians: a general surgeon (23 years' experience) and a physiotherapist (10 years' experience), both Doha agreement panel members
Classification tool
Doha agreement meeting classification system (adductor-, inguinal-, iliopsoas-, pubic-, hip-related groin pain, and other causes)
Primary outcome
Inter-examiner reliability measured by Cohen's Kappa with 95% CI
Analysis
Dichotomous (present/absent) and ordinal (ranked by clinical importance) kappa; subgroup analysis using strict Doha criteria
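
For readers unfamiliar with the primary outcome, the following is a minimal sketch of how an unweighted Cohen's kappa is computed for one dichotomous (present/absent) entity. The function name and the ratings are hypothetical and purely illustrative; they are not the study's data or code, and the sketch does not cover the confidence intervals or the weighted kappa used for the ranked analysis.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' labels on the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters gave the same label.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings (1 = entity present, 0 = absent) for 10 symptomatic sides.
examiner_1 = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
examiner_2 = [1, 1, 0, 0, 0, 1, 1, 1, 0, 0]

kappa = cohens_kappa(examiner_1, examiner_2)
print(round(kappa, 2))
```

In this made-up example the examiners agree on 8 of 10 sides (80%), but half of that agreement is expected by chance, so kappa is well below the raw agreement figure. This is why the study reports kappa rather than percentage agreement alone.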

What they found

  • Adductor-related groin pain: kappa 0.40 (95% CI 0.16-0.64), fair on dichotomous scale; weighted kappa in the 0.52-0.65 range after ranking
  • Inguinal-related groin pain: kappa 0.44 (95% CI not available in this summary), moderate on dichotomous scale
  • Iliopsoas-related groin pain: kappa 0.57 (95% CI 0.37-0.77), moderate on dichotomous scale
  • Pubic-related groin pain: kappa 0.12 (95% CI -0.23 to 0.46), slight
  • Hip-related groin pain: kappa 0.62 (95% CI 0.30-0.95), substantial
  • Other causes of groin pain: kappa 0.13 (95% CI -0.35 to 0.61), slight
  • Overall exact agreement on same classification combination: 14/48 participants (29%) and 15/66 sides (23%)
  • Unilateral single-entity cases: 100% agreement (7/7 participants)
  • Subgroup analysis (strict criteria): kappa 0.48 for inguinal-, 0.49 for adductor-, 0.73 for iliopsoas-, 0.23 for pubic-related groin pain
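
The descriptive labels attached to the kappa values above (slight, fair, moderate, substantial) match the widely used Landis and Koch bands. The sketch below assumes those bands and applies them to the primary-analysis values listed in this summary; the function name is hypothetical, and the summary does not state which interpretation scale the authors used.

```python
def kappa_label(kappa):
    """Descriptive label for a kappa value, assuming the Landis and Koch bands."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Primary-analysis (dichotomous) kappa values as listed in this summary.
reported = {
    "adductor-related": 0.40,
    "inguinal-related": 0.44,
    "iliopsoas-related": 0.57,
    "pubic-related": 0.12,
    "hip-related": 0.62,
    "other causes": 0.13,
}

labels = {entity: kappa_label(value) for entity, value in reported.items()}
for entity, label in labels.items():
    print(f"{entity}: {reported[entity]:.2f} ({label})")
```

Note that the labels describe point estimates only. Several of the confidence intervals above span more than one band, and two of them include zero.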

Limitations

  • Both examiners were experienced Doha agreement panel experts, limiting generalizability to clinicians with less groin-pain specialization
  • Tertiary referral setting inflated complexity (bilateral pain, multiple entities), which may have reduced reliability compared to primary care
  • Study population was exclusively male athletes, so findings cannot be assumed to apply to female athletes
  • Small sample for low-prevalence entities (hip-related 11%, other causes 10%), increasing kappa uncertainty

Why it matters

For patients
Athletes with groin pain should know that the label assigned to their condition can differ between clinicians, particularly if pain is present on both sides or involves more than one structure.
For clinicians
When applying the Doha classification, agreement was perfect for unilateral, single-entity presentations, but clinicians should interpret multi-entity or bilateral cases with caution. Using strict Doha criteria improved agreement, and ranking entities by clinical importance improved it for adductor-, inguinal-, and iliopsoas-related pain.
For readers
This is the first study to formally test how consistently the widely used Doha groin classification can be applied, revealing important gaps that need to be addressed before the system can support reliable clinical decision-making or research.

Source

doi:10.1111/sms.14248
