Difference between revisions of "Gender: Male/Female"
Revision as of 21:15, 22 March 2022
Kai et al. (2017) pdf
- Models predicting student retention in an online college program
- J48 decision trees achieved significantly lower Kappa but higher AUC for male students than female students
- JRip decision rules achieved much lower Kappa and AUC for male students than female students
Hu and Rangwala (2020) pdf
- Models predicting if a college student will fail in a course
- The multiple cooperative classifier model (MCCM) was the best at reducing bias, or discrimination against male students, and performed particularly well for the Psychology course.
- Other models (Logistic Regression and Rawlsian Fairness) performed far worse for male students, particularly in Computer Science and Electrical Engineering courses.
Anderson et al. (2019) pdf
- Models predicting six-year college graduation
- False negative rates were greater for male students than female students when SVM, Logistic Regression, and SGD were used
Gardner, Brooks and Baker (2019) [pdf]
- Model predicting MOOC dropout, specifically through slicing analysis
- Some algorithms studied performed worse for female students than male students, particularly in courses with 45% or less male presence
Riazy et al. (2020) [pdf]
- Model predicting course outcome
- Fairly marginal differences between groups were found in prediction quality and in the overall proportion of students predicted to pass
- The direction of these differences was inconsistent between algorithms
Lee and Kizilcec (2020) [pdf]
- Models predicting college success (defined as a median grade or above)
- Random forest algorithms performed significantly worse for male students than female students
- The fairness of the model, namely demographic parity and equality of opportunity, as well as its accuracy, improved after correcting the threshold values
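The two fairness criteria named in the entry above can be illustrated with a minimal sketch. This is not the procedure from Lee and Kizilcec (2020); the function names, the toy scores, and the per-group thresholds below are all hypothetical, chosen only to show how group-specific thresholds can shrink both gaps.

```python
def rates(y_true, y_score, threshold):
    """Return (positive prediction rate, true positive rate) at a threshold."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    pos_rate = sum(preds) / len(preds)
    # Predictions for the students who actually succeeded (label 1).
    on_positives = [p for p, y in zip(preds, y_true) if y == 1]
    tpr = sum(on_positives) / len(on_positives) if on_positives else float("nan")
    return pos_rate, tpr


def fairness_gaps(groups, thresholds):
    """groups: {name: (y_true, y_score)}; thresholds: {name: float}.

    Returns (demographic parity gap, equal opportunity gap), each computed
    as the largest minus the smallest value across groups.
    """
    stats = [rates(y, s, thresholds[g]) for g, (y, s) in groups.items()]
    pos_rates = [pos for pos, _ in stats]
    tprs = [tpr for _, tpr in stats]
    return max(pos_rates) - min(pos_rates), max(tprs) - min(tprs)


# Hypothetical toy data: (true labels, predicted success scores) per group.
groups = {
    "female": ([1, 1, 0, 0], [0.9, 0.7, 0.6, 0.2]),
    "male": ([1, 1, 0, 0], [0.6, 0.4, 0.3, 0.1]),
}

shared = fairness_gaps(groups, {"female": 0.5, "male": 0.5})
adjusted = fairness_gaps(groups, {"female": 0.65, "male": 0.35})
print("shared threshold gaps:  ", shared)
print("adjusted threshold gaps:", adjusted)
```

A demographic parity gap of zero means both groups are predicted to succeed at the same rate; an equal opportunity gap of zero means students who actually succeed are identified at the same rate in both groups.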
Yu et al. (2020) [pdf]
- Model predicting undergraduate short-term (course grades) and long-term (average GPA) success
- Female students were inaccurately predicted to achieve greater short-term and long-term success than male students.
- The fairness of models improved when a combination of institutional and click data was used in the model
Yu and colleagues (2021) [pdf]
- Models predicting college dropout for students in residential and fully online programs
- Whether the protected attributes were included or not, the models had worse true negative rates but better recall for male students
- The model was worse for male students studying in the online program in terms of true negative rates, recall, and accuracy.
Riazy et al. (2020) [[pdf](https://www.scitepress.org/Papers/2020/93241/93241.pdf)]
- Models predicting course outcome of students in a virtual learning environment (VLE)
- More male students were predicted to pass the course than female students, but only by a slight margin, and this overestimation was not consistent across different algorithms
- Among the algorithms, Naive Bayes had the lowest normalized mutual information value and the highest ABROCA value, i.e., the largest area between the two groups' ROC curves
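
ABROCA, mentioned in the last entry, measures the absolute area between the ROC curves of two groups, so it captures performance differences at every threshold rather than at a single one. The following is a minimal sketch in plain Python, not the implementation used in any of the studies listed here; the function names and the `steps` grid size are assumptions, and each group is assumed to contain at least one positive and one negative example.

```python
def roc_points(y_true, y_score):
    """Return the ROC curve as (fpr, tpr) points from (0, 0) to (1, 1)."""
    pairs = sorted(zip(y_score, y_true), reverse=True)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Examples with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def tpr_at(points, x):
    """Interpolate TPR at false positive rate x (upper value on vertical steps)."""
    best = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                y = max(y0, y1)
            else:
                y = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
            best = max(best, y)
    return best


def abroca(group_a, group_b, steps=1000):
    """Approximate the area between two ROC curves with the trapezoid rule.

    Each group is a (y_true, y_score) pair.
    """
    roc_a = roc_points(*group_a)
    roc_b = roc_points(*group_b)
    total = 0.0
    prev = abs(tpr_at(roc_a, 0.0) - tpr_at(roc_b, 0.0))
    for k in range(1, steps + 1):
        x = k / steps
        cur = abs(tpr_at(roc_a, x) - tpr_at(roc_b, x))
        total += (prev + cur) / (2 * steps)
        prev = cur
    return total


# Hypothetical toy data: the model ranks group A perfectly and group B in reverse.
group_a = ([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
group_b = ([0, 0, 1, 1], [0.9, 0.8, 0.2, 0.1])
print(abroca(group_a, group_a))  # identical curves give no area between them
print(abroca(group_a, group_b))  # close to the maximum possible value of 1
```

Because the absolute difference is integrated, crossing ROC curves cannot cancel each other out, which a simple difference of AUC values would allow.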