Difference between revisions of "Speech Recognition for Education"

Latest revision as of 06:09, 10 June 2022

Wang et al. (2018) pdf

Automated scoring model for evaluating English spoken responses
SpeechRater gave significantly lower scores than human raters for German students
SpeechRater gave higher scores than human raters to students from China, with H1-rater scores higher than the mean

Loukina & Buzick (2017) pdf

A model (SpeechRater) that automatically scores open-ended spoken responses for speakers with documented or suspected speech impairments
SpeechRater was less accurate for test takers who were deferred for signs of speech impairment (ρ² = .57) than for test takers who were given accommodations for documented disabilities (ρ² = .73)

Loukina et al. (2019) pdf

Models providing automated speech scores on an English language proficiency assessment
The L1-specific model (trained on the speaker’s native language) was the least fair, especially for Chinese, Japanese, and Korean speakers, but not for German speakers
All models (Baseline, Fair feature subset, L1-specific) performed worse for Japanese speakers

@@ Line 1: / Line 1: @@
+Wang et al. (2018) [https://www.researchgate.net/publication/336009443_Monitoring_the_performance_of_human_and_automated_scores_for_spoken_responses pdf]
+*Automated scoring model for evaluating English spoken responses
+*SpeechRater gave significantly lower scores than human raters for German students
+*SpeechRater gave higher scores than human raters to students from China, with H1-rater scores higher than the mean
+Loukina & Buzick (2017) [https://onlinelibrary.wiley.com/doi/pdfdirect/10.1002/ets2.12170 pdf]
+*A model (SpeechRater) that automatically scores open-ended spoken responses for speakers with documented or suspected speech impairments
+*SpeechRater was less accurate for test takers who were deferred for signs of speech impairment (ρ<sup>2</sup> = .57) than for test takers who were given accommodations for documented disabilities (ρ<sup>2</sup> = .73)
+Loukina et al. (2019) [https://aclanthology.org/W19-4401.pdf pdf]
+*Models providing automated speech scores on an English language proficiency assessment
+*The L1-specific model (trained on the speaker’s native language) was the least fair, especially for Chinese, Japanese, and Korean speakers, but not for German speakers
+*All models (Baseline, Fair feature subset, L1-specific) performed worse for Japanese speakers