Simulation for Assessment of Milestones in Emergency Medicine Residents

Publication Date


Journal Title

Acad Emerg Med


© 2017 by the Society for Academic Emergency Medicine Objectives: All residency programs in the United States are required to report their residents' progress on the milestones to the Accreditation Council for Graduate Medical Education (ACGME) biannually. Since the development and institution of this competency-based assessment framework, residency programs have been attempting to ascertain the best ways to assess resident performance on these metrics. Simulation was recommended by the ACGME as one method of assessment for many of the milestone subcompetencies. We developed three simulation scenarios with scenario-specific milestone-based assessment tools. We aimed to gather validity evidence for this tool. Methods: We conducted a prospective observational study to investigate the validity evidence for three mannequin-based simulation scenarios for assessing individual residents on emergency medicine (EM) milestones. The subcompetencies (i.e., patient care [PC]1, PC2, PC3) included were identified via a modified Delphi technique using a group of experienced EM simulationists. The scenario-specific checklist (CL) items were designed based on the individual milestone items within each EM subcompetency chosen for assessment and reviewed by experienced EM simulationists. Two independent live raters who were EM faculty at the respective study sites scored each scenario following brief rater training. The inter-rater reliability (IRR) of the assessment tool was determined by measuring intraclass correlation coefficient (ICC) for the sum of the CL items as well as the global rating scales (GRSs) for each scenario. Comparing GRS and CL scores between various postgraduate year (PGY) levels was performed with analysis of variance. Results: Eight subcompetencies were chosen to assess with three simulation cases, using 118 subjects. Evidence of test content, internal structure, response process, and relations with other variables were found. The ICCs for the sum of the CL items and the GRSs were >0.8 for all cases, with one exception (clinical management GRS = 0.74 in sepsis case). The sum of CL items and GRSs (p < 0.05) discriminated between PGY levels on all cases. However, when the specific CL items were mapped back to milestones in various proficiency levels, the milestones in the higher proficiency levels (level 3 [L3] and 4 [L4]) did not often discriminate between various PGY levels. L3 milestone items discriminated between PGY levels on five of 12 occasions they were assessed, and L4 items discriminated only two of 12 times they were assessed. Conclusion: Three simulation cases with scenario-specific assessment tools allowed evaluation of EM residents on proficiency L1 to L4 within eight of the EM milestone subcompetencies. Evidence of test content, internal structure, response process, and relations with other variables were found. Good to excellent IRR and the ability to discriminate between various PGY levels was found for both the sum of CL items and the GRSs. However, there was a lack of a positive relationship between advancing PGY level and the completion of higher-level milestone items (L3 and L4).

Volume Number


Issue Number



205 - 220

Document Type





School of Medicine

Primary Department

Emergency Medicine





For the public and Northwell Health campuses