Abstract
This study addresses the lack of an objective and efficient method for measuring exam difficulty. We use a Large Language Model to evaluate the difficulty of Chinese College Entrance Exams from 1999 to 2003, focusing on mathematics because it is a core subject, is universally applicable, and serves as a key differentiator owing to its varying difficulty levels. We validate the method against question order, expert evaluations, and student performance scores. Our findings reveal that exam difficulty significantly affects the mismatch between universities and students. This highlights the critical role of exam difficulty in shaping educational policy and influencing student decision-making, and ultimately in determining the efficiency and fairness of the university matching process.