Brief summary, from the abstract
In this cross-sectional study, AI chatbot–generated patient education on ACL injury was generally understandable and actionable and showed little sex, gender, ethnic, or socioeconomic bias. However, it read at a difficult level, often missed key information, and tended to use a negative tone that could heighten patient fear.
- Four large language models generated 40 ACL education responses across 10 personas; mean reading level (Flesch-Kincaid Grade Level) ranged from 9.9 (SD=0.8) to 11.4 (SD=1.5), harder than recommended for general patients.
- 36 of 40 responses (90%) met the 70% PEMAT-P threshold for understandability and 27 (67.5%) for actionability.
- No statistically significant language differences were found between personas across the models (p>.05), but 37 of 40 responses (92.5%) carried a negative tone.
- This was a single cross-sectional analysis of 40 AI outputs, so findings describe these models' behavior rather than measure real patient outcomes.
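
For readers unfamiliar with the reading-level metric above, the standard Flesch-Kincaid Grade Level formula can be sketched from raw text counts. This is a minimal illustration, not the study's analysis pipeline: the function names and the example counts (100 words, 5 sentences, 150 syllables) are hypothetical, and the abstract does not say which tool the authors used to count syllables. The second helper only reproduces the percentage arithmetic reported in the bullets.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level computed from raw counts (standard formula)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def percent_of_responses(n_meeting: int, n_total: int) -> float:
    """Share of responses meeting a criterion, as a percentage."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_meeting / n_total


if __name__ == "__main__":
    # Hypothetical passage: 100 words, 5 sentences, 150 syllables (not study data).
    print(round(flesch_kincaid_grade(100, 5, 150), 2))

    # Counts taken from the summary: 36, 27, and 37 of 40 responses.
    for n in (36, 27, 37):
        print(n, percent_of_responses(n, 40))
```

A passage averaging 20 words per sentence and 1.5 syllables per word lands near the lower end of the range reported above, which gives a rough sense of how dense the chatbot text was.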