Difference between revisions of "Native Language and Dialect"

From Penn Center for Learning Analytics Wiki
(Added Sha et al (2021))
 
(6 intermediate revisions by 2 users not shown)
Line 1: Line 1:
Naismith et al. (2018) [[http://d-scholarship.pitt.edu/40665/1/EDM2018_paper_37.pdf pdf]]
Naismith et al. (2018) [http://d-scholarship.pitt.edu/40665/1/EDM2018_paper_37.pdf pdf]
* a model that measures L2 learners’ lexical sophistication with the frequency list based on the native speaker corpora
* Model that measures L2 learners’ lexical sophistication using a frequency list derived from native-speaker corpora
* Arabic-speaking learners are rated systematically lower across all levels of English proficiency than speakers of Chinese, Japanese, Korean, and Spanish.
* Arabic-speaking learners are rated systematically lower across all levels of human-assessed English proficiency than speakers of Chinese, Japanese, Korean, and Spanish
* Level 5 Arabic-speaking learners are unfairly evaluated to have similar level of lexical sophistication as Level 4 learners from China, Japan, Korean and Spain .
* Level 5 Arabic-speaking learners are inaccurately evaluated as having a level of lexical sophistication similar to that of Level 4 learners from China, Japan, Korea, and Spain
* When used on ETS corpus, “high”-labeled essays by Japanese-speaking learners are rated significantly lower in lexical sophistication than Arabic, Japanese, Korean and Spanish peers.
* When used on the ETS corpus, essays by Japanese-speaking learners with higher human-rated lexical sophistication are rated significantly lower in lexical sophistication than Arabic, Japanese, Korean and Spanish peers


Loukina et al. (2019) [[https://aclanthology.org/W19-4401.pdf pdf]]
Loukina et al. (2019) [https://aclanthology.org/W19-4401.pdf pdf]


* Models providing automated speech scores on English language proficiency assessment
* Models providing automated speech scores on English language proficiency assessment
* L1-specific model trained on the speaker’s native language was the least fair, especially for Chinese, Japanese, and Korean speakers, but not for German speakers
* L1-specific model trained on the speaker’s native language was the least fair, especially for Chinese, Japanese, and Korean speakers, but not for German speakers
* All models (Baseline, Fair feature subset, L1-specific) performed disadvantageously for Japanese speakers
* All models (Baseline, Fair feature subset, L1-specific) performed worse for Japanese speakers
 
 
Rzepka et al. (2022) [https://www.insticc.org/node/TechnicalProgram/CSEDU/2022/presentationDetails/109621 pdf]
* Models predicting whether a student will quit a spelling learning activity without completing it
* Multiple algorithms have slightly better false positive rates for second-language speakers than for native speakers, but equivalent performance on multiple other metrics
 
 
Sha et al. (2021) [https://angusglchen.github.io/files/AIED2021_Lele_Assessing.pdf pdf]
* Models predicting whether a MOOC discussion forum post is content-relevant or content-irrelevant
* MOOCs taught in English
* ABROCA varied from 0.03 to 0.08 for non-native speakers of English versus native speakers
* Balancing the size of each group in the training set reduced ABROCA

Latest revision as of 11:01, 4 July 2022

Naismith et al. (2018) [http://d-scholarship.pitt.edu/40665/1/EDM2018_paper_37.pdf pdf]

  • Model that measures L2 learners’ lexical sophistication using a frequency list derived from native-speaker corpora
  • Arabic-speaking learners are rated systematically lower across all levels of human-assessed English proficiency than speakers of Chinese, Japanese, Korean, and Spanish
  • Level 5 Arabic-speaking learners are inaccurately evaluated as having a level of lexical sophistication similar to that of Level 4 learners from China, Japan, Korea, and Spain
  • When used on the ETS corpus, essays by Japanese-speaking learners with higher human-rated lexical sophistication are rated significantly lower in lexical sophistication than Arabic, Japanese, Korean and Spanish peers


Loukina et al. (2019) [https://aclanthology.org/W19-4401.pdf pdf]

  • Models providing automated speech scores on English language proficiency assessment
  • L1-specific model trained on the speaker’s native language was the least fair, especially for Chinese, Japanese, and Korean speakers, but not for German speakers
  • All models (Baseline, Fair feature subset, L1-specific) performed worse for Japanese speakers


Rzepka et al. (2022) [https://www.insticc.org/node/TechnicalProgram/CSEDU/2022/presentationDetails/109621 pdf]

  • Models predicting whether a student will quit a spelling learning activity without completing it
  • Multiple algorithms have slightly better false positive rates for second-language speakers than for native speakers, but equivalent performance on multiple other metrics
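
The group-wise false positive rate comparison summarized above can be illustrated with a minimal sketch. This is not the code used by Rzepka et al.; the function name `false_positive_rate_by_group`, the group labels, and the toy data are all hypothetical and serve only to show how the metric is computed per group.

```python
from collections import defaultdict


def false_positive_rate_by_group(y_true, y_pred, groups):
    """Return {group: FPR}, where FPR = false positives / actual negatives.

    y_true and y_pred hold 0/1 labels; groups holds one group label per row.
    Groups with no actual negatives are omitted from the result.
    """
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 0:
            negatives[group] += 1
            if pred == 1:
                false_pos[group] += 1
    return {g: false_pos[g] / negatives[g] for g in negatives}


# Made-up example data (1 = "student quits the activity"), not from the paper.
y_true = [0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0]
groups = ["native", "native", "native", "L2", "L2", "L2"]
print(false_positive_rate_by_group(y_true, y_pred, groups))
```

A lower false positive rate for one group means fewer of its students who would have completed the activity are wrongly flagged as likely to quit.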


Sha et al. (2021) [https://angusglchen.github.io/files/AIED2021_Lele_Assessing.pdf pdf]

  • Models predicting whether a MOOC discussion forum post is content-relevant or content-irrelevant
  • MOOCs taught in English
  • ABROCA varied from 0.03 to 0.08 for non-native speakers of English versus native speakers
  • Balancing the size of each group in the training set reduced ABROCA
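
ABROCA (Absolute Between-ROC Area) is the area between the ROC curves of two groups, so 0 means the curves coincide and larger values mean the model discriminates differently for the two groups. The sketch below shows one way to compute it with only the Python standard library. It is an illustration under stated assumptions, not the implementation used by Sha et al.: the function names, the linear interpolation of each curve, and the 1000-step grid are all choices made here.

```python
from bisect import bisect_right


def roc_points(labels, scores):
    """Return (fpr, tpr) lists for one group, from (0, 0) to (1, 1)."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("each group needs both positive and negative labels")
    # Sweep the decision threshold from the highest score downwards.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a run of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr


def interp_tpr(x, fpr, tpr):
    """Linearly interpolate TPR at false positive rate x (upper value at jumps)."""
    i = bisect_right(fpr, x) - 1
    if i >= len(fpr) - 1:
        return tpr[-1]
    x0, x1 = fpr[i], fpr[i + 1]
    return tpr[i] + (tpr[i + 1] - tpr[i]) * (x - x0) / (x1 - x0)


def abroca(labels_a, scores_a, labels_b, scores_b, grid_size=1000):
    """Approximate the area between two groups' ROC curves (trapezoidal rule)."""
    fpr_a, tpr_a = roc_points(labels_a, scores_a)
    fpr_b, tpr_b = roc_points(labels_b, scores_b)
    xs = [k / grid_size for k in range(grid_size + 1)]
    gaps = [abs(interp_tpr(x, fpr_a, tpr_a) - interp_tpr(x, fpr_b, tpr_b))
            for x in xs]
    return sum((gaps[k] + gaps[k + 1]) / 2 for k in range(grid_size)) / grid_size
```

In use, `labels_a` and `scores_a` would hold the true labels and predicted scores for one group (for example, native speakers) and `labels_b` and `scores_b` those for the other group, both taken from the same trained model.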