AP

Updates and Information About Verifying AP Score Standards

Recent advances in data collection and technology have enabled a more comprehensive approach to setting standards for AP Exams. Over the last three years, the Advanced Placement Program has applied this new approach, known as evidence-based standard setting (EBSS), to determine appropriate score standards for students across a range of AP subjects.

EBSS collects input from hundreds of experts and assembles fine-grained student performance data for analysis, enabling us to verify and set AP score standards with more robust data than ever before.

Conducting the EBSS process for a wide range of AP subjects has had two results: 

  1. Among AP subjects that have long had a typical AP Exam “success rate” (a score of 3 or higher) of ~60%–80%, the EBSS process confirmed those success rates, which did not change as a result. Table 1 contains the list of AP subjects with typical AP success rates of ~60%–80%.
     
  2. Among AP subjects that have had atypical success rates lower than 60%, the EBSS process justified increasing those success rates to align with the 60%–80% success rate historically achieved in most other AP subjects. The 9 subjects whose success rates increased as a result of EBSS are listed in Table 2. After these increases in success rates, AP standards remain more stringent than college grades.

It is important to note that AP sets standards that are significantly higher than the standards represented by colleges’ own grade distributions. Colleges’ grades in Humanities courses are typically 85% Cs or better, and colleges’ grades in STEM courses are typically 75% Cs or better. For most AP subjects, by contrast, the evidence shows that 60%–75% of AP Exams should receive scores of 3 or higher in order to maintain the historical standards associated with AP scores.

More details on how the score-setting process has changed  

Prior to 2022, the AP Program utilized standard-setting panels to confirm or change AP scores every 5–10 years. These panels of 10–18 educators followed established and well-documented protocols and remain viable for many assessment programs today.

In 2022, the AP Program began using evidence-based standard setting to verify scoring standards. EBSS is especially well-suited for exams designed to measure academic content knowledge and skills, like AP Exams. Moreover, because the EBSS process is so heavily anchored in student performance data, AP scores are not tied to college grades, which have shifted over time.

Over the past decade, two key developments have enabled AP to use EBSS rather than smaller panels:

  1. Digital data collection technologies have emerged that make quick, efficient, large-scale data collection and analysis possible. EBSS uses this new technology to collect, organize, and analyze inputs from hundreds of teachers and faculty, rather than relying on the experience and perspectives of just 10–18 panelists.
     
  2. Beginning in fall 2019, the AP Program provided all AP teachers with a new digital library of AP course materials, titled AP Classroom, and an accompanying course and exam description binder. This material, for the first time in the AP Program’s history, established coherent units, topics, learning objectives, and skills for each AP course that explicitly defined the parameters for assessment. This enabled more comprehensive metadata to be applied to each exam question by linking each question (and, if applicable, question part) to the skills, content, and difficulty level it was designed to measure. As a result, more granular and targeted student performance data is available for analysts to use in determining student abilities at basic, moderate, and exceptional levels.

External review and validation

We rely on external experts like the American Council on Education (ACE) to independently review and verify AP processes. In their most recent report, ACE stated that the validity evidence for AP scores with success rates in the typical 60%–80% range was “exceptionally strong.”

Briefing the AP community 

As we do each time we conduct a score verification process, we have shared the findings with thousands of college faculty and teachers through presentations, briefings, and memos at the AP Readings and via the AP teacher community. We have also shared information with, and gathered feedback from, our governance committees and advisory groups that include college faculty, enrollment, and admissions leaders.

More details on the score verification process, including an in-depth example of how this methodology was applied to AP U.S. History Exam scores, can be found in this AP Program Brief.