UMR 8135 CNRS - INaLCO

ANR contracts

NaijaSynCor – A Corpus-Based Macro-Syntactic Study of Naija (Nigerian Pidgin) (01.02.2017-31.07.2020)

PI : Bernard Caron

NaijaSynCor takes an exhaustive and in-depth look at the structure of Naija (Nigerian Pidgin) in Nigeria today. Spoken by educated Nigerians, Naija has been shown to be developing in Lagos as a discrete language, separate from Nigerian English.

This study proposes to assess whether this holds true for the rest of Nigeria, where Naija has over 75 million speakers. It examines diachronic, diatopic, diaphasic, diastratic, and genre variation.

The project is a collaborative effort between two leading Nigerian experts on Naija (F. Egbokhare & C. Ofulue) and two research units that have proven their expertise in corpus annotation in previous programs: LLACAN, on lesser-described languages, and MoDyCo, on the interaction of prosody and syntax in French and the development of large treebanks.

More information...


BULB (Breaking the Unwritten Language Barrier) (01.03.2015 - 28.02.2018)

PI : Gilles Adda & Sebastian Stücker

In a context where a growing number of languages are in danger of extinction and linguists are in dire need of efficient language documentation tools, Breaking the Unwritten Language Barrier (BULB) aims to support the documentation of unwritten languages with the help of modern natural language processing technologies, in particular automatic speech recognition (ASR) and machine translation (MT).

This ANR/DFG project relies on a strong German-French cooperation between linguists and computer scientists from ZAS (F. Hamlaoui), the KIT (S. Stücker) and the University of Stuttgart (S. Zerbian) on the German side, as well as the LPP (M. Adda-Decker, A. Rialland), the LLACAN (M. Van de Velde, D. Idiatov), the LIMSI (L. Lamel and F. Yvon), the LIG (L. Besacier) and the IMMI-CNRS (G. Adda) on the French side. These researchers and their local teams are bringing together their expertise to address the documentation of three mostly unwritten and generally under-resourced African languages of the Bantu family: Basaa

More information...


EcoSen – An ecopoetic approach to the literature of the Senegal River Valley (01.01.2018 – 31.12.2021)

PI : Mélanie Bourlet

EcoSen aims to highlight the links between Pulaar literary output and the environment of the Senegal River region (southern Mauritania, northern Senegal) by applying a novel theoretical approach, which we term Ecopoetics, to the literature of this region and, more widely, to Afrophone literature.

This discipline, which is just beginning to emerge in French research, investigates how texts express the relationship between humans and their environment. It seeks to shed light on an environmental poetics that accepts that there are many ways of inhabiting the world and that more or less implicitly assumes the importance of a socio-political dimension.

The underlying hypothesis of EcoSen is that the environment plays a leading role in the creation of poetry and in how it resonates, both with the local population and in the diaspora.
This hypothesis is based on numerous findings in the previous research of the project members, all of whom have in common extensive field experience in the Senegal River valley.

The Senegal River flows through the entire region in question and its seasonal flooding impacts the entire life of the population. However, since the 1960s, the region has been subject to profound changes, both environmental (droughts, dam construction, the extinction of animal and plant species) and social (increased poverty, mass migration both within and beyond Africa).

This transformation of the countryside has been accompanied by a growing sense of political marginalization and an outspoken voicing of cultural demands supported by some quite dynamic national and international associations, largely articulated by poets. Thus, from the 1960s onwards, contemporary Pulaar literature has been developing in both Africa and Europe.

Simultaneously, the rich, prolific and prestigious oral poetry of the valley, traditionally presented as dependent on a hierarchical social organization, remains deeply rooted in the natural environment, while at the same time evolving in terms of how it is proclaimed, disseminated and honored culturally (e.g. through the Internet and art festivals).

In order to identify the link between poetry and the environment, EcoSen will adopt an interdisciplinary methodology bringing together five specialists from this area of the Sahel (two literature specialists, one geographer, and two anthropologists). Their aim is to undertake collaborative research on the processes, functions and issues surrounding the creation of places and landscapes, starting with a corpus of oral and written poetry representing different social groups (shepherds, hunters, breeders, etc.). New data will be gathered in the field (poetry, cartographic data, linguistic terminology, ethnographic interviews, etc.).

The cross-fertilization of these approaches will lead to the setting up of a digital platform, developed by a team of skilled technicians in collaboration with the researchers, which will serve the interests not only of the disciplines involved but also of the general public. EcoSen’s approach and results will be reflected in publications, and through collaborations in academia (in Senegal and France) and civil society.

EcoSen has the support of the UNESCO initiative Rivers and Heritage: the Natural and Cultural Diversity of River Landscapes. By giving the hitherto subsidiary topic of the environment its rightful place as an aesthetic element essential to understanding the dynamism and social implications of Pulaar poetry, EcoSen proposes a new and stimulating change of perspective on the poetry of the Senegal River valley. This will enable us to grasp, from a new angle, the message of the texts already collected and those to come, and to understand the mobilizing force of the poetic discourse observed in this region. The originality of EcoSen will advance French thinking about Ecopoetics by allowing young researchers to apply this theory to the African context.

Les parlers du Croissant : une approche multidisciplinaire du contact oc-oïl (01.01.2018-31.12.2021)

PI : Nicolas Quint

The zone of the Croissant corresponds to the northern edge of the Massif Central. The Gallo-Roman dialects that are traditionally spoken there (and whose speakers are now usually in their seventies or older) display typical features of both the Oc varieties and the Oïl varieties. The project is a multidisciplinary study of the Croissant dialects:

  1. Constitution of a multi-dialectal corpus;
  2. Comparison of the Croissant dialects with one another;
  3. Fieldwork and language description of individual varieties (phonetics, prosody, morphology, semantics);
  4. Characterisation of the Croissant varieties in relation to neighbouring Oc/Oïl varieties;
  5. Psycholinguistic studies of bilingualism between French and the local variety;
  6. Sociolinguistic study.

The project aims to document a seriously endangered, little-known linguistic heritage. It ensures the preservation of the collected data (which will be made available online), produces scientific publications on selected linguistic aspects and guarantees the return of the findings to the speaker communities.



Finished ANR contracts

ELLAF (Encyclopédie des Littératures en Langues africaines) (01/2014-01/2017)

PI : Ursula Baumgardt

The little-known literature in African languages is rich and encompasses oral literature as well as literature written in different scripts. The great linguistic and formal variety of African literature raises important analytical questions as well as questions relating to the theory of literature in general: What is the link between the status of a language and its capacity to produce literary texts? What are the relations between oral literature and literary writing?

In order to improve the state of documentation, it is essential to develop documentary tools; they are prerequisites for interdisciplinary research that goes beyond the limits of a single type of literature. ELLAF has the ambition to be both a database of literature in African languages, irrespective of their sociolinguistic status, and a research platform. According to a shared protocol, excerpts or full versions of literary texts are presented with a translation in French and/or English on the ELLAF web site. Each text is contextualised: information on the circumstances of its creation or performance is given, and the literary genre to which it belongs is thereby defined.

More information...(in French)


CorTypo (03/2013-03/2017)

PI : Amina Mettouchi

The aim of the CorTypo project is to develop an innovative system of linguistic annotation for natural language corpora in lesser-described spoken languages, with a view to testing linguistic hypotheses on spontaneous discourse data from a typological perspective.

In order to achieve this goal a number of fundamental theoretical questions need to be resolved with respect to language form and language functions. Crucially, the project addresses the question of what kind of theoretical apparatus is required for the comparison of languages displaying different formal means and different functions.

By implementing theoretical solutions into corpus-design and database-design, the project provides the basis for the empirical testing and falsification of hypotheses, and allows the elaboration of new hypotheses on language structure and cross-linguistic comparison. By proposing solutions to the problem of linguistic interoperability, it paves the way for large-scale typological work based on first-hand natural language data.

More information...


RefLex (12/2010 – 05/2015)

PI : Guillaume Segerer

The RefLex project aims at providing the scientific community with (i) a lexical reference corpus of the languages of Africa as well as (ii) the instruments to process and analyse the data of this corpus. A more detailed description (in French) is available as a PDF document.

More information...


Sénélangues (10/2009 – 01/2014)

PI : Stéphane Robert

The project aims to contribute to the documentation and description of the languages of Senegal and to the classification of the Atlantic languages.

The identification of the least documented and/or most endangered languages will allow us to define research priorities.

The description of these languages will considerably advance our knowledge of the languages of Senegal and help take steps to safeguard those that are endangered. The project intends to make an Africanist contribution to linguistic typology as well as to language classification; it will provide valuable arguments for the revision of the contested classification of the Atlantic language family.

More information...(in French)


CORPAFROAS (2007-2012)

PI : Amina Mettouchi

CORPAFROAS was a project financed by the ANR (France) from 2007 to 2012.
It was an integrated pilot project carried out by field linguists for field linguists and typologists, which proposed a methodology for the treatment of fieldwork textual data in little-known languages, from data gathering to automatic searches on the corpus.

It developed new free, open-source and user-friendly software, ELAN-CorpA, on the basis of ELAN (Max Planck Institute Nijmegen).

It made available a corpus of time-aligned annotated first-hand transcriptions of narrative and conversational data from different Afroasiatic languages, with accompanying sound files, list of glosses, grammatical sketches, and metadata.
The corpus is freely accessible online together with software, instruments and publications that aim at facilitating contributions by other field linguists to CORPAFROAS and at inspiring initiatives modelled on the CORPAFROAS project.

More information...