DoReCo database published

The DoReCo database is online! A huge THANKS to all corpus creators and project members for the enormous efforts in building it over the past 3.5 years and for making the inauguration event last week a big success. The DoReCo project is ending soon, but we’ll keep this website online for a while to inform you about publications resulting from DoReCo and other news.

29 July 2022 DoReCo Inauguration Ceremony

After three exciting and intense years of corpus-building, we are thrilled to announce the upcoming inauguration of the complete DoReCo database, with fully processed data sets on all 50 (plus one!) DoReCo languages. To mark this occasion, we invite everyone to a public event on 29 July 2022, 3:30-5:30pm CEST, to be held at the ZAS in Berlin and online. We are looking forward to a keynote address by Evangelia Adamou and we are also very happy that many of the DoReCo corpus contributors will be present to introduce themselves and the languages they work on. The program is as follows:
• 3:00 PM Admission (in person or online Zoom room)
• 3:30 PM Welcome by Manfred Krifka, statements by DoReCo PIs and Postdocs
• 4:00 PM Keynote by Evangelia Adamou
• 4:30 PM DoReCo contributors introduce themselves
• 5:15 PM DoReCo database goes online
• 5:30 PM Reception
Online or on-site attendance is free, but registration (by 22 July) is required:
Direct link to registration

DoReCo is going to Texas!

DoReCo members and friends are organizing a workshop on “Spoken- and Signed-language Corpus Studies in Linguistic Typology” at the 14th International Conference of the Association for Linguistic Typology, to be held at the University of Texas at Austin, USA, on 15-17 December 2022. Submit your abstract by April 1 and come discuss with us!

Documentary linguistics and corpus phonetics now happily married

An edited volume on “Corpus-Based Typology With Spoken Language Corpora” just appeared (diamond open access) with a contribution by DoReCo PI Frank Seifart on “Combining documentary linguistics and corpus phonetics to advance corpus-based typology”, arguing that documentary linguistics and corpus phonetics form a happy marriage, an example being – you guessed it – DoReCo! Check out the other excellent contributions, too.

Nikolaus Himmelmann Abralin ao vivo talk

Our advisory board member Nikolaus Himmelmann, who’s also a founding father of Documentary Linguistics and pioneer of corpus-based prosodic typology, will give a presentation on “Universals of Language 3.0” at Abralin ao vivo on Wednesday 12.01.2022, 5:00 PM (UTC) / 6:00 PM (a recording of the talk will also be available anytime after that on the website). Highly recommended!

SLE best PostDoc presentation award for DoReCo’s Matt Stave

Matthew Stave’s presentation on morphological typology received the prestigious award for the best presentation by a PostDoc at this year’s Societas Linguistica Europaea (SLE) conference. The paper analyses 19 DoReCo corpora, testing correlations between morphological characteristics associated with ‘agglutinating’ vs. ‘inflecting’ languages. Stay tuned for the write-up of this study!

Matthew Stave, Kilu von Prince & Frank Seifart. 2021. A usage-based approach to morphological typology. Paper presented at the Societas Linguistica Europaea (SLE) 2021. Workshop 14: Integrating sociolinguistics and typological perspectives on language variation, 30 August – 3 September 2021, online.

Phonological vs. morphological complexity in 21 DoReCo languages

Corpus-based measures taken on 21 DoReCo data sets shed new light on an old puzzle: How are phonological and morphological complexity related? It turns out there is a positive typological correlation, specifically between syllable complexity and morphological synthesis, even when looking separately at nouns vs. verbs, and at word-initial vs. word-final complexity. Why? Read all about it in Easterday, Stave, Allassonnière-Tang & Seifart’s newest Frontiers paper.

Final lengthening in 17 DoReCo languages

We finalized processing of 17 languages and analyzed these regarding final lengthening. Results were presented at three conferences: the 12th International Seminar on Speech Production (poster), the 18th Old World Conference on Phonology (abstract), and the 43rd Annual Conference of the German Linguistic Society (DGfS) (workshop program). Thanks to the audiences for feedback! Here’s a snapshot of some of the results: