Summer School: Learner Corpus Research – Theory and practical applications

The Faculty of Linguistics and Literary Studies at the University of Bremen, Germany, is pleased to announce that it will host a summer school on Learner Corpus Research in August 2018, organised under the aegis of the Learner Corpus Association.

The aim of the event is to introduce researchers to the field of Learner Corpus Research through a series of overview lectures and hands-on sessions. The summer school is targeted both at young researchers, e.g. PhD students who have recently embarked on a learner corpus project, and at more experienced researchers from neighbouring fields such as corpus linguistics, SLA or LTA who want to know more about this dynamic, interdisciplinary field of research.

The following topics will be covered:
– an overview of the field of LCR and its resources
– learner corpus methodology and annotation
– statistics for the analysis of learner corpus data
– combining learner corpora with other data types in SLA research
– a selection of elective modules (consisting of a lecture and hands-on session) depending on the needs and interests of the participants

The classes are taught by an international team of leading experts in the field. Participants will also have the opportunity to give a brief presentation on their project and meet one of the teachers to discuss their projects individually.


– registration fee: EUR 55
– no tuition fees
– small bursary for international PhD students towards their travel expenses

Online registration opens on Friday, January 26th, at noon. For practical reasons, the number of participants is restricted to 22 people. In order to register, download our application form, fill it in and send it by e-mail to Mrs. Reinhilt Schultze at

Best wishes
Marcus Callies

2018 Duolingo Shared Task on Second Language Acquisition Modeling (SLAM)

Duolingo invites research teams to participate in the first SLA Modeling (SLAM) Shared Task, in conjunction with the 13th BEA Workshop and the NAACL-HLT 2018 conference. You can access the detailed task description at: .

The goal of this task is to predict future mistakes that learners of English, Spanish, and French will make, based on a history of mistakes they have made in the past. The data set contains more than 2 million tokens (words) from exercises submitted by 6,000+ students over the course of their first 30 days using Duolingo (

New and interesting research opportunities in this task:

– There are three tracks for learners of (1) English, (2) Spanish, and (3) French. Teams are encouraged to explore features which generalize across all three languages.
– Anonymized learner IDs and time data will be provided. This allows teams to explore various personalized, adaptive SLA modeling approaches.
– The sequential nature of the data also allows teams to model language learning (and forgetting!) over time.

Training and development data, baseline code, and evaluation scripts are now available for the task. Test data will be released in February 2018, with final evaluations taking place in March. For more details, please consult the task website.

Shared Task Website:

Shared Task Discussion Group:!forum/sla-modeling

Important Dates:

Jan 10, 2018 – Data release (phase 1): TRAIN and DEV sets
Feb 19, 2018 – Data release (phase 2): blind TEST set
Mar 19, 2018 – Final predictions deadline
Mar 21, 2018 – Final results announcement
Mar 28, 2018 – Draft system papers due
Apr 16, 2018 – Camera-ready system papers due
Jun 05, 2018 – Workshop at NAACL-HLT in New Orleans!

Task Organizers:

Burr Settles (Duolingo), Chris Brust (Duolingo), Erin Gustafson (Duolingo), Masato Hagiwara (Duolingo), Bozena Pajak (Duolingo), Joseph Rollinson (Duolingo), Hideki Shima (Duolingo), Nitin Madnani (ETS)

Best regards,
SLAM Shared Task Organizers

Special issue of the IJLCR on Segmental, prosodic and fluency features in phonetic learner corpora

Table of Contents

Jürgen Trouvain, Frank Zimmerer, Bernd Möbius, Mária Gósy and Anne Bonneau
105 – 117
Malte Belz, Simon Sauer, Anke Lüdeling and Christine Mooshammer
118 – 148
Mária Gósy, Dorottya Gyarmathy and András Beke
149 – 174
María Luisa García Lecumberri, Martin Cooke and Mirjam Wester
175 – 195
Ulrike Gut
196 – 222
Sylvain Detey and Isabelle Racine
223 – 249
Oliver Niebuhr, Maria Alm, Nathalie Schümchen and Kerstin Fischer
250 – 277
Erratum Vol 3, Issue 1
List of reviewers
Referees for Volume 3 (2017)
279 – 280

PhD fellowship in Corpus Linguistics and Second Language Acquisition (L2 French)

The Centre for English Corpus Linguistics has an opening for a PhD fellowship for a total period of four years, with a start date between February and June 2018.

The position is part of the FNRS-funded research project entitled Lexicogrammatical complexity in French as a Foreign Language: the impact of mode. The project is supervised by Magali Paquot (UCLouvain), with Alex Housen (VUB) as co-supervisor.

The project is part of a larger research programme that aims to define and circumscribe the linguistic construct of lexicogrammatical complexity, i.e. the complexity that arises from the (native-like) preferred co-selection of syntactic structures and lexical items in language use, within the framework of usage-based theories of language, and to theoretically and empirically demonstrate its relevance for L2 complexity research, and more generally for theories of L2 use and development (see Paquot, 2017; in press).

The main objective of the PhD project is to investigate the impact of mode (speech vs. writing) on lexicogrammatical complexity, with a focus on L2 French performance data.

Job description:

The research project is a joint venture between the Centre for English Corpus Linguistics (CECL) at the UCLouvain and the Center of Linguistics at the VUB. The candidate will be affiliated to the Institut Langage et Communication (ILC, UCLouvain) and will also prepare a joint UCLouvain-VUB PhD in Linguistics.

Activities that the candidate will perform include:

  • develop and implement (i) theoretical concepts in line with the focus of the research project and (ii) appropriate methodological procedures for investigating these concepts;
  • conduct corpus-based analyses of L1 and L2 French written and spoken samples;
  • interpret the results of the analyses and report on the project in conference presentations and academic publications;
  • by the end of the four-year term, submit and defend a PhD dissertation based on the project.

Requirements and profile:

  • Master's degree in Linguistics, Applied Linguistics, Language & Literature, Natural Language Processing or in Language Learning and Teaching, with a master's thesis on a topic relevant to the project (note: a degree in French Linguistics is an asset, not a requirement);
  • excellent record of BA and MA level study;
  • excellent command of French, very good command of English;
  • excellent and demonstrated analytic skills;
  • ability to work with common software packages (including MS Word, Excel and PowerPoint);
  • basic knowledge of corpus-linguistic techniques is a requirement;
  • knowledge of statistics and statistical software is an asset;
  • programming skills in Perl or Python are also an asset;
  • excellent and demonstrated self-management skills, ability and willingness to work in a team;
  • willingness to live in or near Louvain-la-Neuve and to travel abroad (for short-term research stays and to attend international academic conferences).

Terms of employment:

  • the contract will initially be for one year, renewable three times, for a total of four years;
  • the candidate receives a doctoral fellowship grant (starting at approx. EUR 1868 net per month) and full medical insurance;
  • the position requires residence in Belgium, preferably in or near Louvain-la-Neuve;
  • applicants from outside the EU are responsible for obtaining the necessary visa or permits, with the assistance of the UCLouvain staff department.

Application Deadline: Review of applications will begin on 1 February 2018 and continue until the position is filled.

Please include with your application:

  • a cover letter in English, in which you specify why you are interested in this position and how you meet the job requirements outlined above;
  • a curriculum vitae in English;
  • a concise academic statement in French, in which you outline your expectations about and plans for graduate study and career goals;
  • a copy of BA and MA diplomas and degrees;
  • a copy of your master's thesis and academic publications (if applicable);
  • the names and full contact details of two academic referees.

Shortlisted candidates will be invited for an interview (in situ or via video conferencing) in the second half of February 2018.

Applications (as an email attachment) and inquiries should be addressed to:

Dr. Magali Paquot

Centre for English Corpus Linguistics

Université Catholique de Louvain



Paquot, M. (2017). The phraseological dimension in interlanguage complexity research. Special issue of Second Language Research on ‘Multiple approaches to L2 Complexity’ (guest editors: Alex Housen and Bastien De Clercq). DOI: 10.1177/0267658317694221

Paquot, M. (in press). Phraseological competence: a useful toolbox to delimitate CEFR levels in higher education? Insights from a study of EFL learners’ use of statistical collocations. Special issue of Language Assessment Quarterly on ‘Language tests for academic enrolment and the CEFR’.