Difference between revisions of "National Origin or National Location"

Latest revision as of 20:13, 1 September 2024

Švábenský et al. (2024) pdf

Classification models for predicting grades (worse than an average grade, “unsuccessful”, or equal/better than an average grade, “successful”)
Investigating bias based on university students' regional background in the context of the Philippines
Demographic groups based on 1 of 5 locations from which students accessed online courses in Canvas
Bias evaluation using AUC, weighted F1-score, and MADD showed consistent results across all groups; no unfairness was observed
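As an illustration of what a per-group comparison of this kind can look like, the sketch below computes AUC separately for each demographic group and reports the largest gap between groups. It is not the authors' code: the function names, the group labels "A" and "B", and the toy data are all hypothetical, and it covers only AUC, not weighted F1-score or MADD.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the probability that a random positive example is scored
    above a random negative one (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_group(groups, labels, scores):
    """Split the predictions by group and compute AUC within each group."""
    buckets = defaultdict(lambda: ([], []))
    for g, y, s in zip(groups, labels, scores):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in buckets.items()}


# Made-up toy data (hypothetical, for illustration only):
# 1 = "successful", 0 = "unsuccessful"; scores are model probabilities.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.5, 0.3, 0.8]

per_group = auc_by_group(groups, labels, scores)
gap = max(per_group.values()) - min(per_group.values())
print(per_group, gap)
```

A small gap between groups would be consistent with the "no unfairness observed" reading above; how small is small enough is a judgment the sketch does not make.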

Li et al. (2021) pdf

Model predicting student achievement on the standardized examination PISA
The U.S.-trained model was less accurate for students from countries with lower national development scores (e.g. Indonesia, Vietnam, Moldova)

Wang et al. (2018) pdf

Automated scoring model for evaluating English spoken responses
SpeechRater gave a significantly lower score than human raters for German students
SpeechRater gave higher scores than human raters for Chinese students, with H1-rater scores higher than the mean

Ogan et al. (2015) pdf

Multi-national models predicting learning gains from students' help-seeking behavior
Models built on only U.S. or combined data sets performed extremely poorly for Costa Rica
Models performed better when built on and applied to the same country, except for the Philippines, where the model built on that country's data was slightly outperformed by the model built on U.S. data

Bridgeman et al. (2012) pdf

A later version of e-rater, an automated scoring model for evaluating English essays
E-rater gave better scores to Chinese-speaking test-takers (Mainland China, Taiwan, Hong Kong) and Korean-speaking test-takers when assessing TOEFL (independent prompt) essays
E-rater gave lower scores to Arabic, Hindi, and Spanish speakers when assessing their written responses to the independent prompt in TOEFL

Bridgeman et al. (2009) page

Automated scoring models for evaluating English essays (e-rater)
E-rater gave significantly better scores than human raters for TOEFL essays (independent task) written by speakers of Chinese and Korean
E-rater correlated poorly with human raters and gave better scores than human raters for GRE essays (both issue and argument prompts) written by Chinese speakers

@@ Line 1: / Line 1: @@
 Švábenský et al. (2024) [https://educationaldatamining.org/edm2024/proceedings/2024.EDM-posters.82/2024.EDM-posters.82.pdf pdf]
-* Classification models for predicting grades (worse than an average grade, “unsuccessful”, or equal/better than an average grade, “successful”)
+*Classification models for predicting grades (worse than an average grade, “unsuccessful”, or equal/better than an average grade, “successful”)
-* Investigating bias based on university students' regional background in the context of the Philippines
+*Investigating bias based on university students' regional background in the context of the Philippines
-* Demographic groups based on 1 of 5 locations from which students accessed online courses in Canvas
+*Demographic groups based on 1 of 5 locations from which students accessed online courses in Canvas
-* Bias evaluation using AUC, weighted F1-score, and MADD showed consistent results across all groups, no unfairness was observed
+*Bias evaluation using AUC, weighted F1-score, and MADD showed consistent results across all groups, no unfairness was observed
-Ogan et al. (2015) [https://link.springer.com/content/pdf/10.1007/s40593-014-0034-8.pdf pdf]
+Li et al. (2021) [https://arxiv.org/pdf/2103.15212.pdf pdf]
-* Multi-national models predicting learning gains from student's help-seeking behavior
+*Model predicting student achievement on the standardized examination PISA
-* Models built on only U.S. or combined data sets performed extremely poorly for Costa Rica
+*Inaccuracy of the U.S.-trained model was greater for students from countries with lower scores of national development (e.g. Indonesia, Vietnam, Moldova)
-* Models performed better when built on and applied for the same country, except for Philippines where model built on that country which was outperformed slightly by model built on U.S. data
-Li et al. (2021) [https://arxiv.org/pdf/2103.15212.pdf pdf]
+Wang et al. (2018) [https://www.researchgate.net/publication/336009443_Monitoring_the_performance_of_human_and_automated_scores_for_spoken_responses pdf]
-* Model predicting student achievement on the standardized examination PISA
+*Automated scoring model for evaluating English spoken responses
-* Inaccuracy of the U.S.-trained model was greater for students from countries with lower scores of national development (e.g. Indonesia, Vietnam, Moldova)
+*SpeechRater gave a significantly lower score than human raters for German students
+*SpeechRater scored gave higher scores than human raters for Chinese students, with H1-rater scores higher than mean
-Wang et al. (2018) [https://www.researchgate.net/publication/336009443_Monitoring_the_performance_of_human_and_automated_scores_for_spoken_responses pdf]
+Ogan et al. (2015) [https://link.springer.com/content/pdf/10.1007/s40593-014-0034-8.pdf pdf]
-* Automated scoring model for evaluating English spoken responses
+*Multi-national models predicting learning gains from student's help-seeking behavior
-* SpeechRater gave a significantly lower score than human raters for German students
+*Models built on only U.S. or combined data sets performed extremely poorly for Costa Rica
-* SpeechRater scored gave higher scores than human raters for Chinese students, with H1-rater scores higher than mean
+*Models performed better when built on and applied for the same country, except for Philippines where model built on that country which was outperformed slightly by model built on U.S. data
-Bridgeman et al. (2009) [https://www.researchgate.net/publication/242203403_Considering_Fairness_and_Validity_in_Evaluating_Automated_Scoring page]
+Bridgeman et al. (2012) [https://www.tandfonline.com/doi/pdf/10.1080/08957347.2012.635502?needAccess=true pdf]
-* Automated scoring models for evaluating English essays, or e-rater
+*A later version of automated scoring models for evaluating English essays, or e-rater
+*E-rater gave  better scores for test-takers from Chinese speakers (Mainland China, Taiwan, Hong Kong) and Korean speakers when assessing TOEFL (independent prompt) essay
+*E-rater gave lower scores for Arabic, Hindi, and Spanish speakers when assessing their written responses to independent prompt in TOEFL
-* E-Rater gave significantly better scores than human rater for TOEFL essays (independent task) written by speakers of Chinese and Korean
-* E-Rater correlated poorly with human rater and gave better scores than human rater for GRE essays (both issue and argument prompts) written by Chinese speakers
+Bridgeman et al. (2009) [https://www.researchgate.net/publication/242203403_Considering_Fairness_and_Validity_in_Evaluating_Automated_Scoring page]
+*Automated scoring models for evaluating English essays, or e-rater
-Bridgeman et al. (2012) [https://www.tandfonline.com/doi/pdf/10.1080/08957347.2012.635502?needAccess=true pdf]
+*E-Rater gave significantly better scores than human rater for TOEFL essays (independent task) written by speakers of Chinese and Korean
+*E-Rater correlated poorly with human rater and gave better scores than human rater for GRE essays (both issue and argument prompts) written by Chinese speakers
-* A later version of automated scoring models for evaluating English essays, or e-rater
-* E-rater gave  better scores for test-takers from Chinese speakers (Mainland China, Taiwan, Hong Kong) and Korean speakers when assessing TOEFL (independent prompt) essay
-* E-rater gave lower scores for Arabic, Hindi, and Spanish speakers when assessing their written responses to independent prompt in TOEFL
