Difference between revisions of "Speech Recognition for Education"

From Penn Center for Learning Analytics Wiki
(Added Entry)
 
(correction)
 
(7 intermediate revisions by one other user not shown)
Bridgeman, Trapani, and Attali (2012) [pdf]
Wang et al. (2018) [https://www.researchgate.net/publication/336009443_Monitoring_the_performance_of_human_and_automated_scores_for_spoken_responses pdf]
*Automated scoring model for evaluating English spoken responses
*SpeechRater gave significantly lower scores than human raters to German students
*SpeechRater gave higher scores than human raters to students from China, with H1-rater scores higher than mean


* A later version of the E-Rater system for automatic grading of GRE essays
 
* Model gave lower scores to speakers of Arabic and Hindi
Loukina & Buzick (2017) [https://onlinelibrary.wiley.com/doi/pdfdirect/10.1002/ets2.12170 pdf]
*A model (SpeechRater) for automatically scoring open-ended spoken responses from speakers with documented or suspected speech impairments
*SpeechRater was less accurate for test takers who were deferred for signs of speech impairment (ρ<sup>2</sup> = .57) than for test takers who were given accommodations for documented disabilities (ρ<sup>2</sup> = .73)
 
 
Loukina et al. (2019) [https://aclanthology.org/W19-4401.pdf pdf]
*Models providing automated speech scores on an English language proficiency assessment
*The L1-specific model, trained on the speaker’s native language, was the least fair, especially for Chinese, Japanese, and Korean speakers, but not for German speakers
*All models (Baseline, Fair feature subset, L1-specific) performed worse for Japanese speakers

Latest revision as of 05:09, 10 June 2022

Wang et al. (2018) pdf

  • Automated scoring model for evaluating English spoken responses
  • SpeechRater gave significantly lower scores than human raters to German students
  • SpeechRater gave higher scores than human raters to students from China, with H1-rater scores higher than mean


Loukina & Buzick (2017) pdf

  • A model (SpeechRater) for automatically scoring open-ended spoken responses from speakers with documented or suspected speech impairments
  • SpeechRater was less accurate for test takers who were deferred for signs of speech impairment (ρ² = .57) than for test takers who were given accommodations for documented disabilities (ρ² = .73)


Loukina et al. (2019) pdf

  • Models providing automated speech scores on an English language proficiency assessment
  • The L1-specific model, trained on the speaker’s native language, was the least fair, especially for Chinese, Japanese, and Korean speakers, but not for German speakers
  • All models (Baseline, Fair feature subset, L1-specific) performed worse for Japanese speakers
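
The entries above all compare automated scores with human scores across groups of test takers. As a rough illustration only, one simple way to look at such a gap is the mean signed difference (machine minus human) per group. The sketch below is an assumption for illustration, not the method used in any of the cited studies; the function name `mean_score_gap_by_group`, the group labels, and the toy scores are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_score_gap_by_group(records):
    """Return {group: mean(machine - human)} for (group, human, machine) records.

    A negative value means the automated score was, on average, lower than
    the human score for that group; a positive value means it was higher.
    """
    gaps = defaultdict(list)
    for group, human, machine in records:
        gaps[group].append(machine - human)
    return {group: mean(diffs) for group, diffs in gaps.items()}


# Hypothetical toy data, not taken from any of the studies listed above.
toy_records = [
    ("A", 3.0, 2.5),
    ("A", 4.0, 3.5),
    ("B", 3.0, 3.5),
    ("B", 2.0, 2.5),
]
gaps = mean_score_gap_by_group(toy_records)
```

With this toy data, group "A" gets a negative gap and group "B" a positive one. A real analysis would also need a significance test and some control for proficiency level, which this sketch leaves out.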