Black/African-American Learners in North America

From Penn Center for Learning Analytics Wiki

Latest revision as of 20:01, 28 June 2023

Kai et al. (2017) pdf

  • Models predicting student retention in an online college program
  • J48 decision trees achieved much lower Kappa and AUC for Black students than White students
  • JRip decision rules achieved almost identical Kappa and AUC for Black students and White students
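
The comparison above rests on computing Kappa and AUC separately for each group. The following is a minimal sketch of that bookkeeping in plain Python; the function names, the 0.5 cut-off, and the toy data are illustrative assumptions and are not taken from Kai et al. (2017).

```python
from collections import defaultdict

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for binary 0/1 labels: agreement corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    p_true1 = sum(y_true) / n
    p_pred1 = sum(y_pred) / n
    expected = p_true1 * p_pred1 + (1 - p_true1) * (1 - p_pred1)
    if expected == 1:
        return 0.0  # degenerate case: only one class present in both vectors
    return (observed - expected) / (1 - expected)

def auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def metrics_by_group(groups, y_true, scores, threshold=0.5):
    """Split the rows by group label and compute Kappa and AUC within each group."""
    rows = defaultdict(lambda: ([], []))
    for g, t, s in zip(groups, y_true, scores):
        rows[g][0].append(t)
        rows[g][1].append(s)
    result = {}
    for g, (ts, ss) in rows.items():
        preds = [int(s >= threshold) for s in ss]
        result[g] = {"kappa": cohen_kappa(ts, preds), "auc": auc(ts, ss)}
    return result

# Made-up toy data (assumption): two groups, "A" and "B", four students each.
groups = ["A"] * 4 + ["B"] * 4
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.2, 0.1, 0.9, 0.4, 0.6, 0.1]
report = metrics_by_group(groups, y_true, scores)
```

A large gap between the per-group values in `report` is the kind of disparity the J48 result describes; near-identical values correspond to the JRip result.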


Hu and Rangwala (2020) [https://files.eric.ed.gov/fulltext/ED608050.pdf pdf]

  • Models predicting whether a college student will fail a course
  • The multiple cooperative classifier model (MCCM) was the best at reducing bias, or discrimination, against African-American students, while other models (particularly Logistic Regression and Rawlsian Fairness) performed far worse
  • The level of bias was inconsistent across courses, with MCCM prediction showing the least bias for Psychology and the greatest bias for Computer Science


Christie et al. (2019) [https://files.eric.ed.gov/fulltext/ED599217.pdf pdf]

  • Models predicting students' high school dropout
  • The decision trees showed little difference in AUC among Black, White, Hispanic, Asian, American Indian and Alaska Native, and Native Hawaiian and Pacific Islander students


Lee and Kizilcec (2020) [https://arxiv.org/pdf/2007.00088.pdf pdf]

  • Models predicting college success (median grade or above)
  • Random forest algorithms performed significantly worse for underrepresented minority students (URM; Black, American Indian, Hawaiian or Pacific Islander, Hispanic, and Multicultural) than non-URM students (White and Asian)
  • The fairness of the model, namely demographic parity and equality of opportunity, as well as its accuracy, improved after adjusting the decision threshold from 0.5 to group-specific values
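
The threshold adjustment described above can be illustrated with a small sketch. Equality of opportunity is taken here to mean equal true positive rates across groups. The selection rule (the highest candidate threshold that reaches a reference group's true positive rate), the group names, and the toy scores are assumptions made for illustration; they are not the procedure or data of Lee and Kizilcec (2020).

```python
def true_positive_rate(y_true, scores, threshold):
    """Share of actual positives scored at or above the threshold."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    return sum(s >= threshold for s in positives) / len(positives)

def pick_threshold(y_true, scores, target_tpr, candidates):
    """Highest candidate threshold whose TPR reaches target_tpr (lowest candidate if none does)."""
    for c in sorted(candidates, reverse=True):
        if true_positive_rate(y_true, scores, c) >= target_tpr:
            return c
    return min(candidates)

# Made-up toy data (assumption): labels "y" and model scores "s" for two groups.
data = {
    "reference": {"y": [1, 1, 1, 0, 0], "s": [0.9, 0.7, 0.6, 0.4, 0.2]},
    "comparison": {"y": [1, 1, 1, 0, 0], "s": [0.8, 0.45, 0.35, 0.3, 0.1]},
}
candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

# Single shared threshold of 0.5 for everyone.
tpr_single = {g: true_positive_rate(d["y"], d["s"], 0.5) for g, d in data.items()}

# Keep 0.5 for the reference group; choose a group-specific value for the other group.
target = tpr_single["reference"]
thresholds = {
    "reference": 0.5,
    "comparison": pick_threshold(
        data["comparison"]["y"], data["comparison"]["s"], target, candidates
    ),
}
tpr_adjusted = {
    g: true_positive_rate(d["y"], d["s"], thresholds[g]) for g, d in data.items()
}

gap_before = abs(tpr_single["reference"] - tpr_single["comparison"])
gap_after = abs(tpr_adjusted["reference"] - tpr_adjusted["comparison"])
```

Comparing `gap_before` with `gap_after` shows how much of the true-positive-rate gap the group-specific threshold closes on this toy data. Demographic parity would be checked the same way, using the share of all students predicted positive instead of the true positive rate.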


Yu et al. (2020) [https://files.eric.ed.gov/fulltext/ED608066.pdf pdf]

  • Models predicting undergraduate short-term (course grades) and long-term (average GPA) success
  • Black students were inaccurately predicted to perform worse for both short-term and long-term success
  • Model fairness improved when the models used either click data or a combination of click and survey data, rather than institutional data


Yu et al. (2021) [https://dl.acm.org/doi/pdf/10.1145/3430895.3460139 pdf]

  • Models predicting college dropout for students in residential and fully online programs
  • Whether or not socio-demographic information was included, the model showed worse true negative rates for underrepresented minority students (URM; i.e., not White or Asian), and worse accuracy for URM students studying in person
  • The model showed better recall for URM students, whether they were in residential or online programs
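
The three metrics named above (true negative rate, recall, accuracy) all come from a per-group confusion matrix. The sketch below shows how a group can have better recall and a worse true negative rate at the same accuracy. It treats dropout as the positive class, which is an assumption, and the group names and predictions are made up rather than taken from Yu et al. (2021).

```python
def confusion_rates(y_true, y_pred):
    """Recall, true negative rate, and accuracy for binary 0/1 labels (1 = dropout)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "recall": tp / (tp + fn),          # actual dropouts that were flagged
        "tnr": tn / (tn + fp),             # non-dropouts correctly left unflagged
        "accuracy": (tp + tn) / len(y_true),
    }

# Made-up toy predictions (assumption) for two hypothetical groups.
by_group = {
    "urm": confusion_rates([1, 1, 0, 0], [1, 1, 1, 0]),
    "non_urm": confusion_rates([1, 1, 0, 0], [1, 0, 0, 0]),
}
```

On this toy data the two groups have the same accuracy, while one group is over-flagged (lower true negative rate) and the other is under-flagged (lower recall), which is why accuracy alone can hide the disparity.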


Ramineni & Williamson (2018) [https://files.eric.ed.gov/fulltext/EJ1202928.pdf pdf]

  • Revised automated scoring engine for assessing GRE essays
  • E-rater gave African American test-takers significantly lower scores than human raters when assessing their written responses to argument prompts
  • The shorter essays written by African American test-takers were more likely to receive lower scores for showing weakness in content and organization


Bridgeman et al. (2009) [https://www.researchgate.net/publication/242203403_Considering_Fairness_and_Validity_in_Evaluating_Automated_Scoring pdf]

  • Automated scoring models (e-rater) for evaluating English essays
  • The score difference between human raters and e-rater was significantly smaller for 11th grade essays written by African American and White students


Bridgeman et al. (2012) [https://www.tandfonline.com/doi/pdf/10.1080/08957347.2012.635502 pdf]

  • A later version of the automated scoring models (e-rater) for evaluating English essays
  • E-rater gave significantly lower scores than human raters when assessing African-American students’ written responses to the GRE issue prompt


Jiang & Pardos (2021) [https://dl.acm.org/doi/pdf/10.1145/3461702.3462623 pdf]

  • Predicting university course grades using LSTM
  • Roughly equal accuracy across racial groups
  • Slightly better accuracy (~1%) across racial groups when including race in the model


Zhang et al. (2022) [https://www.upenn.edu/learninganalytics/ryanbaker/EDM22_paper_35.pdf pdf]

  • Detecting student use of self-regulated learning (SRL) in the mathematical problem-solving process
  • For each SRL-related detector, relatively small differences in AUC were observed across racial/ethnic groups
  • No racial/ethnic group consistently had the best-performing detectors


Li, Xing, & Leite (2022) [https://dl.acm.org/doi/pdf/10.1145/3506860.3506869?casa_token=OZmlaKB9XacAAAAA:2Bm5XYi8wh4riSmEigbHW_1bWJg0zeYqcGHkvfXyrrx_h1YUdnsLE2qOoj4aQRRBrE4VZjPrGw pdf]

  • Models predicting whether two students will communicate on an online discussion forum
  • Compared members of overrepresented racial groups to members of underrepresented racial groups (over 2/3 Black/African American)
  • Multiple fairness approaches lead to ABROCA of under 0.01 for overrepresented versus underrepresented students
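
ABROCA measures the absolute area between two groups' ROC curves, so a value under 0.01 means the curves nearly coincide. The sketch below approximates it in plain Python. It treats each ROC curve as a step function, which is exact only when scores have no ties, and integrates on a fixed grid of false positive rates. The function names, grid size, and toy data are assumptions, not the computation used by Li, Xing, & Leite (2022).

```python
def roc_points(y_true, scores):
    """ROC points (fpr, tpr) obtained by lowering the threshold one distinct score at a time."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    ranked = sorted(zip(scores, y_true), reverse=True)
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last item sharing this score.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points

def tpr_at(points, fpr):
    """Step-wise ROC height: best TPR reachable without exceeding the given FPR."""
    return max(t for f, t in points if f <= fpr)

def abroca(y_a, s_a, y_b, s_b, steps=1000):
    """Approximate the area between two ROC curves on a grid of FPR bin midpoints."""
    roc_a, roc_b = roc_points(y_a, s_a), roc_points(y_b, s_b)
    grid = [(i + 0.5) / steps for i in range(steps)]
    return sum(abs(tpr_at(roc_a, x) - tpr_at(roc_b, x)) for x in grid) / steps

# Made-up toy data (assumption): one group is ranked perfectly, the other is not.
y_a, s_a = [1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]
y_b, s_b = [1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]
same = abroca(y_a, s_a, y_a, s_a)
different = abroca(y_a, s_a, y_b, s_b)
```

Unlike a simple difference in AUC, this quantity stays positive when two ROC curves cross, because the gap is taken in absolute value at every false positive rate.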


Litman et al. (2021) [https://link.springer.com/chapter/10.1007/978-3-030-78292-4_21 html]

  • Automated essay scoring models inferring text evidence usage
  • All algorithms studied have less than 1% of error explained by whether the student is Black


Jeong et al. (2022) [https://fated2022.github.io/assets/pdf/FATED-2022_paper_Jeong_Racial_Bias_ML_Algs.pdf pdf]

  • Predicting 9th grade math score from academic performance, surveys, and demographic information
  • Despite comparable accuracy, the model tends to underpredict Black students' performance
  • Several fairness correction methods equalize false positive and false negative rates across groups
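
A model can be about equally accurate for two groups and still underpredict one of them, because accuracy ignores the direction of the error. The sketch below separates the two by reporting the mean signed error (predicted minus actual) alongside the mean absolute error. The group names and scores are made up for illustration and are not data from Jeong et al. (2022).

```python
def error_summary(actual, predicted):
    """Mean signed error (negative = underprediction) and mean absolute error."""
    errors = [p - a for a, p in zip(actual, predicted)]
    return {
        "mean_signed_error": sum(errors) / len(errors),
        "mean_absolute_error": sum(abs(e) for e in errors) / len(errors),
    }

# Made-up toy scores (assumption) for two hypothetical groups with the same actual scores.
actual = [60, 70, 80, 90]
summary = {
    "g1": error_summary(actual, [56, 66, 76, 86]),  # every prediction is 4 points low
    "g2": error_summary(actual, [64, 66, 84, 86]),  # errors of the same size, mixed direction
}
```

Both groups have the same mean absolute error here, while only one has a negative mean signed error, which is the pattern described as underprediction despite comparable accuracy.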


Zhang et al. (2023) [https://learninganalytics.upenn.edu/ryanbaker/ISLS23_annotation%20detector_short_submit.pdf pdf]

  • Models developed to detect attributes of student feedback for other students’ mathematics solutions, reflecting the presence of three constructs: 1) commenting on process, 2) commenting on the answer, and 3) relating to self
  • Models have approximately equal performance for African American, Hispanic/Latinx, and White students