Difference between revisions of "Automated Essay Scoring"

From Penn Center for Learning Analytics Wiki
Revision as of 05:16, 24 January 2022

Bridgeman, Trapani, and Attali (2009) [pdf]

  • E-Rater system that automatically grades a student’s essay
  • Essays written by Hispanic and Asian-American students were over-graded relative to those written by White and African American peers
  • Inaccurately gave Chinese and Korean students significantly higher scores than human essay raters did on a test of foreign language proficiency
  • Correlated more poorly with human raters and was biased upward on GRE essay scores for Chinese students
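A minimal sketch of the kind of check behind the "correlated more poorly" finding (not code from Bridgeman et al.; all scores below are invented for illustration): compute the Pearson correlation between human and automated scores for a group of essays. A lower correlation for one demographic group than another suggests the engine tracks human judgment less faithfully for that group.

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores for one group of essays
human   = [3.0, 4.0, 4.5, 5.0, 3.5]
machine = [3.5, 4.5, 4.5, 5.5, 4.0]  # invented engine scores, biased upward

print(round(pearson(human, machine), 2))  # → 0.96
```

Comparing this correlation across demographic groups (rather than its absolute value) is what reveals differential accuracy of the scoring engine.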

Bridgeman, Trapani, and Attali (2012) [pdf]

  • A later version of the E-Rater system for automatic grading of GSE essays
  • The model gave lower scores to African American students than human raters did
  • Chinese students were given higher scores than human essay raters gave
  • The model gave speakers of Arabic and Hindi lower scores

Ramineni & Williamson (2018) [pdf]

  • Revised automated scoring engine for assessing GSE essay
  • African American test takers' relative weakness in content and organization resulted in lower scores than those of Chinese peers, who wrote longer essays

Wang et al. (2018) [https://www.researchgate.net/publication/336009443_Monitoring_the_performance_of_human_and_automated_scores_for_spoken_responses pdf]

  • Automated scoring model for evaluating English spoken responses
  • SpeechRater gave significantly lower scores than human raters to German speakers
  • SpeechRater scored in favor of the Chinese group, whose H1-rater scores were higher than the mean
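The studies above all compare automated scores against human-rater scores within demographic groups. One common way to quantify such a gap (a generic sketch, not the method of any specific paper cited here; all scores are made up) is the mean machine-minus-human difference per group, standardized by the spread of human scores:

```python
import statistics

def group_bias(machine, human):
    """Mean (machine - human) score gap, in SD units of the human scores."""
    diffs = [m - h for m, h in zip(machine, human)]
    return statistics.mean(diffs) / statistics.stdev(human)

# Hypothetical scores for two demographic groups of test takers
scores = {
    "Group A": {"machine": [4.0, 3.5, 4.5, 3.0], "human": [4.5, 4.0, 5.0, 3.5]},
    "Group B": {"machine": [4.5, 5.0, 4.0, 4.5], "human": [4.0, 4.5, 3.5, 4.0]},
}

for group, s in scores.items():
    print(group, round(group_bias(s["machine"], s["human"]), 2))
# Group A → -0.77 (engine under-scores relative to humans)
# Group B →  1.22 (engine over-scores relative to humans)
```

A negative value mirrors findings like SpeechRater under-scoring German speakers; a positive value mirrors the over-scoring reported for Chinese test takers.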