At-risk/Dropout/Stopout/Graduation Prediction
Latest revision as of 22:54, 27 November 2023
Kai et al. (2017) [https://files.eric.ed.gov/fulltext/ED596601.pdf pdf]
- Models predicting student retention in an online college program
- J48 decision trees achieved much lower Kappa and AUC for Black students than White students
- J48 decision trees achieved significantly lower Kappa but higher AUC for male students than female students
- JRip decision rules achieved almost identical Kappa and AUC for Black students and White students
- JRip decision rules achieved much lower Kappa and AUC for male students than female students
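The Kappa comparisons above can be illustrated with a minimal sketch of Cohen's Kappa computed separately per demographic group. This is not code from Kai et al.; the function name `cohen_kappa` and the toy labels for `group_a` and `group_b` are hypothetical and serve only to show how a per-group gap would appear.

```python
def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    # Observed agreement: fraction of cases where prediction equals the label.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected agreement if predictions were independent of the labels.
    expected = sum((y_true.count(l) / n) * (y_pred.count(l) / n) for l in labels)
    if expected == 1.0:
        return float("nan")  # Kappa is undefined when only one class occurs
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: 1 = retained, 0 = not retained.
group_a_true, group_a_pred = [1, 1, 0, 0], [1, 1, 0, 0]
group_b_true, group_b_pred = [1, 1, 0, 0], [1, 0, 0, 0]

kappa_a = cohen_kappa(group_a_true, group_a_pred)
kappa_b = cohen_kappa(group_b_true, group_b_pred)
print(f"group A kappa = {kappa_a:.2f}, group B kappa = {kappa_b:.2f}")
```

A large difference between the two values would be the kind of disparity the bullets above describe; real analyses would also need enough students per group for the estimate to be stable.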
Hu and Rangwala (2020) [https://files.eric.ed.gov/fulltext/ED608050.pdf pdf]
- Models predicting if a college student will fail in a course
- The multiple cooperative classifier model (MCCM) was the best at reducing bias, or discrimination against African-American students, while other models (particularly Logistic Regression and Rawlsian Fairness) performed far worse
- The level of bias was inconsistent across courses, with MCCM prediction showing the least bias for Psychology and the greatest bias for Computer Science
- MCCM was also the best at reducing bias, or discrimination against male students, performing particularly well for the Psychology course
- Other models (Logistic Regression and Rawlsian Fairness) performed far worse for male students, particularly in Computer Science and Electrical Engineering
Anderson et al. (2019) pdf
- Models predicting six-year college graduation
- False negative rates were greater for Latino students when Decision Tree and Random Forest were used
- White students had higher false positive rates across all models (Decision Tree, SVM, Logistic Regression, Random Forest, and SGD)
- False negative rates were greater for male students than female students when SVM, Logistic Regression, and SGD were used
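The group-wise false negative and false positive rates discussed above can be computed along the following lines. This is a minimal sketch, not the procedure from Anderson et al.; the helper `error_rates_by_group` and the toy groups "A" and "B" are hypothetical.

```python
from collections import defaultdict


def error_rates_by_group(y_true, y_pred, groups):
    """Return {group: (fnr, fpr)} for binary labels, where 1 is the positive class."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            counts[g]["tp" if p == 1 else "fn"] += 1
        else:
            counts[g]["fp" if p == 1 else "tn"] += 1
    rates = {}
    for g, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        # FNR = missed positives / all positives; FPR = false alarms / all negatives.
        fnr = c["fn"] / positives if positives else float("nan")
        fpr = c["fp"] / negatives if negatives else float("nan")
        rates[g] = (fnr, fpr)
    return rates


# Hypothetical toy data: 1 = graduated within six years, 0 = did not.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = error_rates_by_group(y_true, y_pred, groups)
for g, (fnr, fpr) in sorted(rates.items()):
    print(f"group {g}: FNR = {fnr:.2f}, FPR = {fpr:.2f}")
```

Comparing the two rates across groups separates two different kinds of harm: a higher FNR means more students in that group who will graduate are predicted not to, while a higher FPR means more students who will not graduate are predicted to.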
Christie et al. (2019) [https://files.eric.ed.gov/fulltext/ED599217.pdf pdf]
- Models predicting high school dropout
- The decision trees showed little difference in AUC among White, Black, Hispanic, Asian, American Indian and Alaska Native, and Native Hawaiian and Pacific Islander students
- The decision trees showed very minor differences in AUC between female and male students
Gardner, Brooks and Baker (2019) [https://www.upenn.edu/learninganalytics/ryanbaker/LAK_PAPER97_CAMERA.pdf pdf]
- Model predicting MOOC dropout, specifically through slicing analysis
- Some algorithms performed worse for female students than male students, particularly in courses with 45% or less male presence
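The idea behind a slicing analysis can be sketched as computing the same metric on each demographic slice and comparing the results. The code below is an illustrative assumption, not the implementation from Gardner, Brooks and Baker; the functions `auc` and `auc_by_slice` and the toy scores are hypothetical.

```python
def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")  # AUC is undefined without both classes
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_slice(y_true, scores, groups):
    """Compute AUC separately for each value of the slicing attribute."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        result[g] = auc([y_true[i] for i in idx], [scores[i] for i in idx])
    return result


# Hypothetical toy data: 1 = dropped out, scores = predicted dropout probability.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.1, 0.6, 0.7, 0.4, 0.3]
groups = ["female"] * 4 + ["male"] * 4

slices = auc_by_slice(y_true, scores, groups)
gap = max(slices.values()) - min(slices.values())
print(slices, "gap:", gap)
```

The pairwise AUC here is quadratic in the number of students per slice, which is fine for a small illustration; a rank-based computation would be preferable for large courses.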
Baker et al. (2020) [https://www.upenn.edu/learninganalytics/ryanbaker/BakerBerningGowda.pdf pdf]
- Model predicting student graduation and SAT scores for military-connected students
- For prediction of graduation, algorithms applied across populations resulted in an AUC of 0.60, degrading from their original performance of 70% or 71% toward chance
- For prediction of SAT scores, algorithms applied across populations resulted in a Spearman's ρ of 0.42 and 0.44, degrading by a third from their original performance toward chance
Kai et al. (2017) [https://files.eric.ed.gov/fulltext/ED596601.pdf pdf]
- Models predicting student retention in an online college program
- J48 decision trees achieved much higher Kappa and AUC for students whose parents did not attend college than those whose parents did
- JRip decision rules achieved much higher Kappa and AUC for students whose parents did not attend college than those whose parents did
Yu et al. (2021) [https://dl.acm.org/doi/pdf/10.1145/3430895.3460139 pdf]
- Models predicting college dropout for students in residential and fully online programs
- The model showed better recall for students who are under-represented minority (URM; not White or Asian), male, first-generation, or with greater financial needs
- Whether the socio-demographic information was included or not, the model showed worse accuracy and true negative rates for residential students who are under-represented minority (URM; not White or Asian), male, first-generation, or with greater financial needs
- Both accuracy and true negative rates were better for students who are first-generation, or with greater financial needs
Verdugo et al. (2022) [https://dl.acm.org/doi/abs/10.1145/3506860.3506902 pdf]
- Algorithms predicting dropout from university after the first year
- Several algorithms achieved better AUC and F1 for students who attended public high schools than for students who attended private high schools
- Several algorithms achieved better AUC for male students than female students; F1 scores were more balanced
Sha et al. (2022) [https://ieeexplore.ieee.org/abstract/document/9849852 link]
- Models predicting dropout on the XuetangX platform using a neural network
- A range of over-sampling methods was tested
- Regardless of the over-sampling method used, dropout prediction performance was slightly better for males
Queiroga et al. (2022) [https://www.mdpi.com/2078-2489/13/9/401 pdf]
- Models predicting secondary school students at risk of failure or dropping out
- Model was unable to predict student success (F1 score = 0.0) for students not in a social welfare program (higher socioeconomic status)
- Model had slightly lower AUC ROC (0.52 instead of 0.56) for students not in a social welfare program (higher socioeconomic status)
Perdomo et al. (2023) [https://www.researchgate.net/publication/370001437_Difficult_Lessons_on_Social_Prediction_from_Wisconsin_Public_Schools pdf]
- Paper discusses a system that predicts probabilities of on-time graduation
- Prediction is less accurate for White students than for other students
- Prediction is more accurate for students with disabilities than for students without disabilities
- Prediction is more accurate for low-income students than for non-low-income students
- Prediction is comparable for male and female students
Cock et al. (2023) [https://dl.acm.org/doi/abs/10.1145/3576050.3576149?casa_token=6Fjh-EUzN-gAAAAA%3AtpRMYzSAVoQFYNzwY5gwSsrnzHIlI0tUjMq6okwgdcCUmuBMVZEtn8eLO52dCtIYUbrHBV_Il9Sx pdf]
- Paper investigates biases in models designed for early identification of middle school students at risk of failing in a flipped-classroom course and an open-ended exploration environment (TugLet)
- Model performs worse for students from schools with higher socio-economic status in the open-ended environment (FNR 0.73 for higher SES and 0.57 for medium SES)
- Model performs worse for males in the open-ended environment (higher FNR for males than females)
- Model performs worse for students with a diploma from a foreign country in the flipped classroom
- Model performs worse for females in the flipped classroom