Duolingo invites research teams to participate in the first SLA Modeling (SLAM) Shared Task, in conjunction with the 13th BEA Workshop and the NAACL-HLT 2018 conference. You can access the detailed task description at: http://sharedtask.duolingo.com .
The goal of this task is to predict the mistakes that learners of English, Spanish, and French will make in the future, based on the mistakes they have made in the past. The data set contains more than 2 million tokens (words) from exercises submitted by 6,000+ students over the course of their first 30 days using Duolingo (https://www.duolingo.com).
New and interesting research opportunities in this task:
- There are three tracks for learners of (1) English, (2) Spanish, and (3) French. Teams are encouraged to explore features which generalize across all three languages.
- Anonymized learner IDs and time data will be provided. This allows teams to explore various personalized, adaptive SLA modeling approaches.
- The sequential nature of the data also allows teams to model language learning (and forgetting!) over time.
Training and development data, baseline code, and evaluation scripts are now available for the task. Test data will be released in February 2018, with final evaluations taking place in March. For more details, please consult the task website.
Shared Task Website: http://sharedtask.duolingo.com
Shared Task Discussion Group:
Jan 10, 2018 – Data release (phase 1): TRAIN and DEV sets
Feb 19, 2018 – Data release (phase 2): blind TEST set
Mar 19, 2018 – Final predictions deadline
Mar 21, 2018 – Final results announcement
Mar 28, 2018 – Draft system papers due
Apr 16, 2018 – Camera-ready system papers due
Jun 05, 2018 – Workshop at NAACL-HLT in New Orleans!
Burr Settles (Duolingo), Chris Brust (Duolingo), Erin Gustafson (Duolingo), Masato Hagiwara (Duolingo), Bozena Pajak (Duolingo), Joseph Rollinson (Duolingo), Hideki Shima (Duolingo), Nitin Madnani (ETS)
SLAM Shared Task Organizers