National Origin or National Location
Revision as of 04:34, 24 January 2022
Bridgeman, Trapani, and Attali (2009) [pdf]
- E-Rater system that automatically grades a student’s essay
- Inaccurately high scores were given to Chinese and Korean students
- E-Rater scores correlated poorly with human rater scores on GRE essays written by Chinese students
Bridgeman, Trapani, and Attali (2012) [pdf]
- A later version of the E-Rater system for automatic grading of GRE essays
- Chinese students were given higher scores than when graded by human essay raters
- Speakers of Arabic and Hindi were given lower scores
Ogan and colleagues (2015) [pdf]
- Multi-national model predicting learning gains from students' help-seeking behavior
- Both the U.S. model and the combined model performed extremely poorly for Costa Rica
- For the Philippines, the U.S.-trained model outperformed a model trained on the Philippines' own data set
Li et al. (2021) [pdf]
- Model predicting student achievement on the standardized examination PISA
- Inaccuracy of the U.S.-trained model was greater for students from countries with lower scores of national development (e.g. Indonesia, Vietnam, Moldova)
Wang et al. (2018) [pdf]
- Automated scoring model for evaluating English spoken responses
- SpeechRater gave significantly lower scores than human raters for the German group
- SpeechRater scored in favor of the Chinese group, with H1-rater scores higher than the mean