Difference between revisions of "Black/African-American Learners in North America"

From Penn Center for Learning Analytics Wiki




Anderson et al. (2019) [https://www.upenn.edu/learninganalytics/ryanbaker/EDM2019_paper56.pdf pdf]
* Models predicting six-year college graduation
* Performance for African-American students comparable to performance for students in other races.


Christie et al. (2019) [https://files.eric.ed.gov/fulltext/ED599217.pdf pdf]
* Models predicting students' high school dropout
* The decision trees showed little difference in AUC among Black, White, Hispanic, Asian, American Indian and Alaska Native, and Native Hawaiian and Pacific Islander.


Lee and Kizilcec (2020) [https://arxiv.org/pdf/2007.00088.pdf pdf]
* Models predicting college success (or median grade or above)
* Random forest algorithms performed significantly worse for underrepresented minority students (URM; Black, American Indian, Hawaiian or Pacific Islander, Hispanic, and Multicultural) than non-URM students (White and Asian)
* The fairness of the model, namely demographic parity and equality of opportunity, as well as its accuracy, improved after correcting the threshold values from 0.5 to group-specific values


Yu et al. (2020) [https://files.eric.ed.gov/fulltext/ED608066.pdf pdf]
* Model predicting undergraduate short-term (course grades) and long-term (average GPA) success
* Black students were inaccurately predicted to perform worse for both short-term and long-term
* The fairness of models improved when either click or a combination of click and survey data, and not institutional data, was included in the model


Ramineni & Williamson (2018) [https://onlinelibrary.wiley.com/doi/10.1002/ets2.12192 pdf]
* Revised automated scoring engine for assessing GRE essay
* Relative weakness in content and organization by African American test takers resulted in lower scores than Chinese peers who wrote longer.
 
 
Yu et al. (2021) [https://dl.acm.org/doi/pdf/10.1145/3430895.3460139 pdf]
* Models predicting college dropout for students in residential and fully online programs
* Whether or not socio-demographic information was included, the model showed worse true negative rates for underrepresented minority (URM; i.e., not White or Asian) students, and worse accuracy for URM students studying in person
* The model showed better recall for URM students, whether they were in residential or online programs
 
 
Ramineni & Williamson (2018) [https://files.eric.ed.gov/fulltext/EJ1202928.pdf pdf]
* Revised automated scoring engine for assessing GRE essay
* E-rater gave African American test-takers significantly lower scores than human raters when assessing their written responses to argument prompts
* The shorter essays written by African American test-takers were more likely to receive lower scores for showing weakness in content and organization
 
 
 
Bridgeman et al. (2009) [https://www.researchgate.net/publication/242203403_Considering_Fairness_and_Validity_in_Evaluating_Automated_Scoring pdf]
* Automated scoring models for evaluating English essays, or e-rater
* The score difference between human rater and e-rater was significantly smaller for 11th grade essays written by African American and White students
 
 
 
Bridgeman et al. (2012) [https://www.tandfonline.com/doi/pdf/10.1080/08957347.2012.635502 pdf]
* A later version of automated scoring models for evaluating English essays, or e-rater
* E-rater gave significantly lower scores than human raters when assessing African-American students’ written responses to issue prompts in the GRE
 
 
Jiang & Pardos (2021) [https://dl.acm.org/doi/pdf/10.1145/3461702.3462623 pdf]
* Predicting university course grades using LSTM
* Roughly equal accuracy across racial groups
* Slightly better accuracy (~1%) across racial groups when including race in model
 
 
Zhang et al. (2022) [https://www.upenn.edu/learninganalytics/ryanbaker/EDM22_paper_35.pdf pdf]
* Detecting student use of self-regulated learning (SRL) in mathematical problem-solving process
* For each SRL-related detector, relatively small differences in AUC were observed across racial/ethnic groups.
* No racial/ethnic group consistently had best-performing detectors
 
 
Li, Xing, & Leite (2022) [https://dl.acm.org/doi/pdf/10.1145/3506860.3506869?casa_token=OZmlaKB9XacAAAAA:2Bm5XYi8wh4riSmEigbHW_1bWJg0zeYqcGHkvfXyrrx_h1YUdnsLE2qOoj4aQRRBrE4VZjPrGw pdf]
* Models predicting whether two students will communicate on an online discussion forum
* Compared members of overrepresented racial groups to members of underrepresented racial groups (over 2/3 Black/African American)
* Multiple fairness approaches lead to ABROCA of under 0.01 for overrepresented versus underrepresented students
 
 
Litman et al. (2021) [https://link.springer.com/chapter/10.1007/978-3-030-78292-4_21 html]
* Automated essay scoring models inferring text evidence usage
* All algorithms studied have less than 1% of error explained by whether student is Black
 
 
Jeong et al. (2022) [https://fated2022.github.io/assets/pdf/FATED-2022_paper_Jeong_Racial_Bias_ML_Algs.pdf]
* Predicting 9th grade math score from academic performance, surveys, and demographic information
* Despite comparable accuracy, model tends to underpredict Black students' performance
* Several fairness correction methods equalize false positive and false negative rates across groups.
 
 
Zhang et al. (2023) [https://learninganalytics.upenn.edu/ryanbaker/ISLS23_annotation%20detector_short_submit.pdf pdf]
* Models developed to detect attributes of student feedback for other students’ mathematics solutions, reflecting the presence of three constructs: 1) commenting on process, 2) commenting on the answer, and 3) relating to self.
* Models have approximately equal performance for African American, Hispanic/Latinx, and White students.

Latest revision as of 20:01, 28 June 2023

Kai et al. (2017) pdf

  • Models predicting student retention in an online college program
  • J48 decision trees achieved much lower Kappa and AUC for Black students than White students
  • JRip decision rules achieved almost identical Kappa and AUC for Black students and White students


Hu and Rangwala (2020) pdf

  • Models predicting if a college student will fail in a course
  • The multiple cooperative classifier model (MCCM) was the best at reducing bias, or discrimination against African-American students, while other models (particularly Logistic Regression and Rawlsian Fairness) performed far worse
  • The level of bias was inconsistent across courses, with MCCM prediction showing the least bias for Psychology and the greatest bias for Computer Science


Christie et al. (2019) pdf

  • Models predicting students' high school dropout
  • The decision trees showed little difference in AUC among Black, White, Hispanic, Asian, American Indian and Alaska Native, and Native Hawaiian and Pacific Islander.


Lee and Kizilcec (2020) pdf

  • Models predicting college success (or median grade or above)
  • Random forest algorithms performed significantly worse for underrepresented minority students (URM; Black, American Indian, Hawaiian or Pacific Islander, Hispanic, and Multicultural) than non-URM students (White and Asian)
  • The fairness of the model, namely demographic parity and equality of opportunity, as well as its accuracy, improved after correcting the threshold values from 0.5 to group-specific values
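
The threshold correction described in the last bullet can be illustrated with a minimal sketch. This is not the code used by Lee and Kizilcec; the function names, the `target_tpr` parameter, and the toy data are assumptions made for illustration. For each group, the sketch picks the score cutoff whose true positive rate is closest to a shared target, which is one simple way to move toward equality of opportunity.

```python
from collections import defaultdict


def true_positive_rate(scores, labels, threshold):
    """Share of actual positives (label 1) scored at or above the threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not positives:
        return 0.0
    return sum(s >= threshold for s in positives) / len(positives)


def group_thresholds(scores, labels, groups, target_tpr):
    """Pick one threshold per group so each group's TPR is closest to target_tpr.

    Hypothetical helper: ties are broken in favour of the higher threshold.
    """
    by_group = defaultdict(lambda: ([], []))
    for s, y, g in zip(scores, labels, groups):
        by_group[g][0].append(s)
        by_group[g][1].append(y)

    thresholds = {}
    for g, (g_scores, g_labels) in by_group.items():
        candidates = sorted(set(g_scores))
        thresholds[g] = min(
            candidates,
            key=lambda t: (abs(true_positive_rate(g_scores, g_labels, t) - target_tpr), -t),
        )
    return thresholds


# Toy data (invented): group B's scores are shifted downward.
scores = [0.9, 0.8, 0.6, 0.3, 0.6, 0.45, 0.4, 0.2]
labels = [1, 1, 1, 0, 1, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

thresholds = group_thresholds(scores, labels, groups, target_tpr=1.0)
```

With a single cutoff of 0.5, the toy group B has a lower true positive rate than group A; group-specific cutoffs close that gap in this example.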


Yu et al. (2020) pdf

  • Model predicting undergraduate short-term (course grades) and long-term (average GPA) success
  • Black students were inaccurately predicted to perform worse for both short-term and long-term success
  • The fairness of models improved when either click or a combination of click and survey data, and not institutional data, was included in the model


Yu et al. (2021) pdf

  • Models predicting college dropout for students in residential and fully online programs
  • Whether or not socio-demographic information was included, the model showed worse true negative rates for underrepresented minority (URM; i.e., not White or Asian) students, and worse accuracy for URM students studying in person
  • The model showed better recall for URM students, whether they were in residential or online programs


Ramineni & Williamson (2018) pdf

  • Revised automated scoring engine for assessing GRE essay
  • E-rater gave African American test-takers significantly lower scores than human raters when assessing their written responses to argument prompts
  • The shorter essays written by African American test-takers were more likely to receive lower scores for showing weakness in content and organization


Bridgeman et al. (2009) pdf

  • Automated scoring models for evaluating English essays, or e-rater
  • The score difference between human rater and e-rater was significantly smaller for 11th grade essays written by African American and White students


Bridgeman et al. (2012) pdf

  • A later version of automated scoring models for evaluating English essays, or e-rater
  • E-rater gave significantly lower scores than human raters when assessing African-American students’ written responses to issue prompts in the GRE


Jiang & Pardos (2021) pdf

  • Predicting university course grades using LSTM
  • Roughly equal accuracy across racial groups
  • Slightly better accuracy (~1%) across racial groups when including race in model


Zhang et al. (2022) pdf

  • Detecting student use of self-regulated learning (SRL) in mathematical problem-solving process
  • For each SRL-related detector, relatively small differences in AUC were observed across racial/ethnic groups.
  • No racial/ethnic group consistently had best-performing detectors


Li, Xing, & Leite (2022) pdf

  • Models predicting whether two students will communicate on an online discussion forum
  • Compared members of overrepresented racial groups to members of underrepresented racial groups (over 2/3 Black/African American)
  • Multiple fairness approaches lead to ABROCA of under 0.01 for overrepresented versus underrepresented students
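
ABROCA (absolute between-ROC area) is the area between the ROC curves of two groups, so a value near 0 means the model ranks both groups about equally well. The sketch below approximates it in plain Python; it is not the implementation used in the study, and the function names, the `steps` parameter, and the grid-based integration are assumptions made for illustration. Each group must contain at least one positive and one negative label.

```python
def roc_points(scores, labels):
    """ROC curve as (fpr, tpr) points from (0, 0) to (1, 1); tied scores move together."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def tpr_at(points, fpr):
    """Linearly interpolated TPR at a given FPR; vertical steps take their upper end."""
    best = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= fpr <= x1:
            if x1 == x0:
                best = max(best, y1)
            else:
                best = max(best, y0 + (y1 - y0) * (fpr - x0) / (x1 - x0))
    return best


def abroca(scores_a, labels_a, scores_b, labels_b, steps=1000):
    """Trapezoidal approximation of the area between two groups' ROC curves."""
    roc_a = roc_points(scores_a, labels_a)
    roc_b = roc_points(scores_b, labels_b)
    gaps = [
        abs(tpr_at(roc_a, k / steps) - tpr_at(roc_b, k / steps))
        for k in range(steps + 1)
    ]
    return sum((gaps[k] + gaps[k + 1]) / 2 for k in range(steps)) / steps


# Toy data (invented): identical groups give 0; a perfect vs. fully inverted
# ranking gives a value close to the maximum of 1.
same = abroca([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
opposite = abroca([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1], [0, 0, 1, 1])
```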


Litman et al. (2021) html

  • Automated essay scoring models inferring text evidence usage
  • All algorithms studied have less than 1% of error explained by whether student is Black


Jeong et al. (2022) [1]

  • Predicting 9th grade math score from academic performance, surveys, and demographic information
  • Despite comparable accuracy, model tends to underpredict Black students' performance
  • Several fairness correction methods equalize false positive and false negative rates across groups.
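
Comparing false positive and false negative rates across groups, as in the last bullet, can be sketched as follows. This assumes predictions have already been binarized (for example, above or below a score cutoff); the function name and the toy data are hypothetical and are not taken from Jeong et al.

```python
def error_rates_by_group(y_true, y_pred, groups):
    """Return {group: {"fpr": ..., "fnr": ...}} for binary labels and predictions.

    A rate is None when the group has no negatives (FPR) or no positives (FNR).
    """
    counts = {}
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts.setdefault(group, {"fp": 0, "fn": 0, "pos": 0, "neg": 0})
        if truth == 1:
            c["pos"] += 1
            if pred == 0:
                c["fn"] += 1  # actual positive predicted negative (underprediction)
        else:
            c["neg"] += 1
            if pred == 1:
                c["fp"] += 1  # actual negative predicted positive (overprediction)
    return {
        group: {
            "fpr": c["fp"] / c["neg"] if c["neg"] else None,
            "fnr": c["fn"] / c["pos"] if c["pos"] else None,
        }
        for group, c in counts.items()
    }


# Toy data (invented): overall accuracy is equal across groups,
# but group B is underpredicted (higher false negative rate).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = error_rates_by_group(y_true, y_pred, groups)
```

In this toy example both groups have 75% accuracy, yet the errors fall in opposite directions, which is the pattern that accuracy alone hides.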


Zhang et al. (2023) pdf

  • Models developed to detect attributes of student feedback for other students’ mathematics solutions, reflecting the presence of three constructs: 1) commenting on process, 2) commenting on the answer, and 3) relating to self.
  • Models have approximately equal performance for African American, Hispanic/Latinx, and White students.