Does What We Write Matter? Determining the Features of High- and Low-Quality Summative Written Comments of Students on the Internal Medicine Clerkship Using Pile-Sort and Consensus Analysis: A Mixed-Methods Study

Publication Title

BioMed Central Medical Education


Geisel School of Medicine


Background: Written comments by medical student supervisors provide the written foundation for grade narratives and deans' letters and play an important role in students' professional development. Written comments are widely used, but little has been published about their quality. We hypothesized that medical students share an understanding of the qualities inherent to high-quality and low-quality narrative comments, and we aimed to determine the features that define high- and low-quality comments.
Methods: Using the well-established anthropological pile-sort method, medical students sorted written comments into 'helpful' and 'unhelpful' piles and were then interviewed to determine how they evaluated the comments. We used multidimensional scaling and cluster analysis to reveal how written comments were sorted across student participants. We calculated the degree of shared knowledge to determine the level of internal validity in the data. We transcribed and coded data elicited during the structured interviews to contextualize the students' answers. Comment length was compared using one-way analysis of variance; comment valence and the frequency with which comments were judged helpful were analyzed by chi-square.
Results: Analysis of written comments revealed four distinct clusters. Cluster A comments reinforced good behaviors or gave constructive criticism on how changes could be made. Cluster B comments exhorted students to continue non-specific behaviors already exhibited. Cluster C comments used grading rubric terms without giving student-specific examples. Cluster D comments used sentence fragments lacking verbs and punctuation. Student data exhibited a strong fit to the consensus model, demonstrating that medical students share a robust model of the attributes of helpful and unhelpful comments. There was no correlation between the valence of a comment and its perceived helpfulness.
Conclusions: Students find comments most helpful when they demonstrate knowledge of the student and give specific examples of appropriate behavior to reinforce or inappropriate behavior to eliminate; they find non-actionable, non-specific comments least helpful. Our research and analysis allow us to make recommendations for faculty development around written feedback.
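
Illustrative sketch (not the authors' code): the Methods describe pile-sort data being analyzed with multidimensional scaling, cluster analysis, and a consensus model. The minimal Python example below shows, under stated assumptions, how the two matrices those analyses typically start from could be built: a comment-by-comment similarity (the share of participants who put two comments in the same pile) and a participant-by-participant agreement (the share of comments two participants sorted the same way). The function names, comment identifiers, and data are hypothetical, and the scaling, clustering, and consensus steps themselves are not implemented here.

```python
from itertools import combinations


def item_similarity(sorts):
    """Share of participants who placed each pair of comments in the same pile."""
    items = sorted(sorts[0])
    similarity = {}
    for a, b in combinations(items, 2):
        same_pile = sum(1 for s in sorts if s[a] == s[b])
        similarity[(a, b)] = same_pile / len(sorts)
    return similarity


def participant_agreement(sorts):
    """Share of comments on which each pair of participants chose the same pile."""
    items = sorted(sorts[0])
    agreement = {}
    for i, j in combinations(range(len(sorts)), 2):
        matches = sum(1 for it in items if sorts[i][it] == sorts[j][it])
        agreement[(i, j)] = matches / len(items)
    return agreement


# Hypothetical pile sorts: one dict per participant, comment id -> pile label.
sorts = [
    {"c1": "helpful", "c2": "helpful", "c3": "unhelpful", "c4": "unhelpful"},
    {"c1": "helpful", "c2": "helpful", "c3": "unhelpful", "c4": "helpful"},
    {"c1": "helpful", "c2": "unhelpful", "c3": "unhelpful", "c4": "unhelpful"},
]

similarity = item_similarity(sorts)        # would feed scaling or clustering
agreement = participant_agreement(sorts)   # would feed a consensus analysis

for pair, value in similarity.items():
    print(pair, round(value, 2))
```

In this toy data, comments c1 and c3 are never sorted together, so their similarity is 0, while participants 0 and 1 agree on three of four comments. A real analysis would pass the similarity matrix to a scaling or clustering routine and the agreement matrix to a consensus model.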