Black/African-American Learners in North America
Revision as of 04:59, 10 June 2022
Kai et al. (2017) pdf
- Models predicting student retention in an online college program
- J48 decision trees achieved much lower Kappa and AUC for Black students than White students
- JRip decision rules achieved almost identical Kappa and AUC for Black students and White students
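A per-group comparison of Kappa and AUC like the one summarized above could be computed roughly as follows. This is a minimal sketch on made-up toy data, not the evaluation code from Kai et al.; the function names (`auc`, `cohen_kappa`, `metrics_by_group`), the group labels "A" and "B", and the 0.5 decision threshold are all assumptions made for illustration.

```python
from collections import defaultdict

def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")  # AUC is undefined without both classes
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cohen_kappa(labels, preds):
    """Cohen's kappa for binary labels: agreement corrected for chance."""
    n = len(labels)
    observed = sum(y == p for y, p in zip(labels, preds)) / n
    expected = sum((labels.count(c) / n) * (preds.count(c) / n) for c in (0, 1))
    if expected == 1:
        return float("nan")
    return (observed - expected) / (1 - expected)

def metrics_by_group(labels, scores, groups, threshold=0.5):
    """Split the data by group and compute AUC and kappa within each group."""
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    report = {}
    for g, (ys, ss) in by_group.items():
        preds = [int(s >= threshold) for s in ss]
        report[g] = {"auc": auc(ys, ss), "kappa": cohen_kappa(ys, preds)}
    return report

# Hypothetical toy data: 1 = retained, 0 = not retained.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.3, 0.6, 0.7, 0.4, 0.1]
groups = ["A"] * 4 + ["B"] * 4
report = metrics_by_group(labels, scores, groups)
```

Reporting the metrics separately for each group, rather than one pooled value, is what makes a gap such as the one described for J48 visible.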
Hu and Rangwala (2020) pdf
- Models predicting if a college student will fail in a course
- The multiple cooperative classifier model (MCCM) was the best at reducing bias, or discrimination against African-American students, while other models (particularly Logistic Regression and Rawlsian Fairness) performed far worse
- The level of bias was inconsistent across courses, with MCCM prediction showing the least bias for Psychology and the greatest bias for Computer Science
Christie et al. (2019) [https://files.eric.ed.gov/fulltext/ED599217.pdf pdf]
- Models predicting students' high school dropout
- The decision trees showed little difference in AUC among Black, White, Hispanic, Asian, American Indian and Alaska Native, and Native Hawaiian and Pacific Islander.
Lee and Kizilcec (2020) [https://arxiv.org/pdf/2007.00088.pdf pdf]
- Models predicting college success (or median grade or above)
- Random forest algorithms performed significantly worse for underrepresented minority students (URM; Black, American Indian, Hawaiian or Pacific Islander, Hispanic, and Multicultural) than non-URM students (White and Asian)
- The fairness of the model, namely demographic parity and equality of opportunity, as well as its accuracy, improved after correcting the threshold values from 0.5 to group-specific values
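The two fairness criteria named above can be illustrated with a small sketch. Demographic parity compares the rate of positive predictions across groups, and equality of opportunity compares true positive rates. The data, group keys, and threshold values below are invented for illustration and are not taken from Lee and Kizilcec; how the paper actually selected its group-specific thresholds is not described here.

```python
def rates(labels, scores, threshold):
    """Return (positive prediction rate, true positive rate) at a threshold."""
    preds = [int(s >= threshold) for s in scores]
    positive_rate = sum(preds) / len(preds)
    actual_pos = sum(labels)
    true_pos = sum(p for y, p in zip(labels, preds) if y == 1)
    tpr = true_pos / actual_pos if actual_pos else float("nan")
    return positive_rate, tpr

def fairness_gaps(data, thresholds):
    """data: {group: (labels, scores)}; thresholds: {group: cutoff}.

    Returns (demographic parity gap, equal opportunity gap), each the
    largest minus the smallest value across groups; 0 means parity.
    """
    stats = {g: rates(ys, ss, thresholds[g]) for g, (ys, ss) in data.items()}
    pos_rates = [pr for pr, _ in stats.values()]
    tprs = [tpr for _, tpr in stats.values()]
    return max(pos_rates) - min(pos_rates), max(tprs) - min(tprs)

# Hypothetical scores: the model is systematically less confident for "urm".
data = {
    "non_urm": ([1, 1, 0, 0], [0.9, 0.7, 0.6, 0.2]),
    "urm": ([1, 1, 0, 0], [0.6, 0.4, 0.3, 0.1]),
}

# One shared cutoff of 0.5 versus hand-picked group-specific cutoffs.
single = fairness_gaps(data, {"non_urm": 0.5, "urm": 0.5})
per_group = fairness_gaps(data, {"non_urm": 0.65, "urm": 0.35})
```

In this toy case both gaps shrink from 0.5 to 0 once each group gets its own cutoff. Real data will rarely close both gaps at once, so the example only shows the mechanism.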
Yu et al. (2020) pdf
- Model predicting undergraduate short-term (course grades) and long-term (average GPA) success
- Black students were inaccurately predicted to perform worse in both the short term and the long term
- The fairness of models improved when either click data or a combination of click and survey data, rather than institutional data, was included in the model
Yu et al. (2021) pdf
- Models predicting college dropout for students in residential and fully online programs
- Whether the socio-demographic information was included or not, the model showed worse true negative rates for underrepresented minority students (URM; that is, not White or Asian), and worse accuracy if URM students were studying in person
- The model showed better recall for URM students, whether they were in a residential or an online program
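True negative rate and recall are both read off the confusion matrix, so a group can score better on one and worse on the other. The sketch below assumes that the positive class (1) means "dropped out", which is an assumption about the label coding rather than something stated above. The labels and predictions are invented toy values, not data from Yu et al.

```python
def tnr_and_recall(labels, preds):
    """Return (true negative rate, recall) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return tnr, recall

# Hypothetical data: 1 = dropped out (assumed coding), 0 = persisted.
urm_labels = [1, 1, 0, 0, 0, 0]
urm_preds = [1, 1, 1, 1, 0, 0]      # every dropout caught, two false alarms
non_urm_labels = [1, 1, 0, 0, 0, 0]
non_urm_preds = [1, 0, 0, 0, 0, 1]  # one dropout missed, one false alarm

urm_tnr, urm_recall = tnr_and_recall(urm_labels, urm_preds)
non_urm_tnr, non_urm_recall = tnr_and_recall(non_urm_labels, non_urm_preds)
```

In this toy example the URM group has higher recall (1.0 versus 0.5) but a lower true negative rate (0.5 versus 0.75). That pattern arises when a model predicts the positive class more often for one group, which is one possible reading of the two findings above, not a claim about the paper's model.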
Ramineni & Williamson (2018) pdf
- Revised automated scoring engine for assessing GRE essays
- E-rater gave African American test-takers significantly lower scores than human raters when assessing their written responses to argument prompts
- The shorter essays written by African American test-takers were more likely to receive lower scores for showing weakness in content and organization
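A gap between automated and human essay scores can be summarized per group as a mean difference, optionally standardized by the spread of the scores. The sketch below uses one possible standardization (dividing by the root of the average of the two sample variances); that choice, the function name `score_gap`, and the toy scores are assumptions for illustration and may differ from the statistics used in the studies listed here.

```python
from math import sqrt
from statistics import fmean, variance

def score_gap(human, machine):
    """Return (mean difference, standardized mean difference).

    Negative values mean the automated engine scored lower than the humans.
    """
    diff = fmean(machine) - fmean(human)
    pooled_sd = sqrt((variance(human) + variance(machine)) / 2)
    smd = diff / pooled_sd if pooled_sd else float("nan")
    return diff, smd

# Hypothetical essay scores for two groups of test-takers.
group_1_human = [4, 4, 3, 5]
group_1_machine = [3, 4, 3, 4]
group_2_human = [3, 4, 2, 3]
group_2_machine = [3, 4, 2, 3]

gap_1 = score_gap(group_1_human, group_1_machine)
gap_2 = score_gap(group_2_human, group_2_machine)
```

Comparing `gap_1` and `gap_2` shows whether the engine departs from human raters more for one group than for another. Whether such a gap is statistically significant would need a separate test.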
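Bridgeman et al. (2009) [https://www.researchgate.net/publication/242203403_Considering_Fairness_and_Validity_in_Evaluating_Automated_Scoring pdf]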
- Automated scoring models for evaluating English essays, or e-rater
- The score difference between human raters and e-rater was significantly smaller for 11th grade essays written by African American and White students
Bridgeman et al. (2012) pdf
- A later version of automated scoring models for evaluating English essays, or e-rater
- E-rater gave significantly lower scores than human raters when assessing African-American students’ written responses to the issue prompt in the GRE
Jiang & Pardos (2021) pdf
- Predicting university course grades using LSTM
- Roughly equal accuracy across racial groups
- Slightly better accuracy (~1%) across racial groups when including race in the model
Zhang et al. (in press)
- Detecting student use of self-regulated learning (SRL) in the mathematical problem-solving process
- For each SRL-related detector, relatively small differences in AUC were observed across racial/ethnic groups.
- No racial/ethnic group consistently had the best-performing detectors