Workshop
MODIFED: Morphosyntactic Dialect Feature Detection Workshop
- Date: Thursday 20 June 2024 - Friday 21 June 2024
- Times: 20 June 9:00-17:00; 21 June 9:00-14:00
- Location: P.J. Veth Building, Leiden University
- Room: P.J. Veth 1.01 & 1.07
MODIFED: Morphosyntactic Dialect Feature Detection
Workshop Description
In the past two decades, computational approaches to dialect variation, known as dialectometry, have allowed researchers to work efficiently with large amounts of data and, in a data-driven manner, define dialect groups, identify specific dialect features, and search for general tendencies in language variation. One of the main advantages of data-driven dialectology is “avoiding the need to select which features to use as the basis of characterization” (Nerbonne, 2008). However, most published studies in dialectometry are based on data extracted from dialect atlases or surveys containing linguistic features carefully selected by human experts. Automatic extraction and analysis of meaningful features from raw text, such as interviews, would enable researchers to work with data that has not been chosen by experts and which can be considered unbiased. Despite the attractiveness of this type of approach, automatic feature extraction at all linguistic levels is still challenging (Kroon, 2022) and understudied.
We are excited to announce a MODIFED workshop organized by the Re-examining Dialect Syntax Network (REEDs) and Leiden University Centre for Digital Humanities (LUCDH). This two-day workshop will take place at Leiden University on Thursday 20 June and Friday 21 June 2024. The event is designed to foster collaboration among specialists in dialectology, computational linguistics, and corpus linguistics, with a focus on identifying morphosyntactic dialect features from various semi-structured and unstructured sources. It will provide an opportunity for researchers and research groups to reflect on theoretical and/or methodological problems and solutions related to automatic morphosyntactic dialect feature extraction.
Invited Speaker:
Our invited speaker on Thursday is Anne Breitbarth (Ghent University), who will speak on 'Hunting for structures in treebank forests: Considerations on the use of parsed corpora of spontaneous dialect speech'.
Workshop Program
Thursday 20 June
9:30-10:00 Registration and Coffee
10:00-10:15 Workshop opening
10:15-11:15 Hunting for structures in treebank forests: Considerations on the use of parsed corpora of spontaneous dialect speech (Abstract)
Anne Breitbarth, Ghent University (Invited talk)
11:15-12:00 Extracting morphological features from published grammars of African Arabic dialects: methodological considerations (Abstract)
Carolina Zucchi, University of Bayreuth
12:00-14:00 Lunch
14:00-14:45 Automatic discovery of phonological and morphological features in dialect corpora with orthographic normalization (Abstract)
Yves Scherrer, University of Helsinki
14:45-15:30 Extracting dialect features from German social media data using local spatial autocorrelation (Abstract)
Dana Roemling & Jack Grieve, University of Birmingham
15:30-16:00 Coffee break
16:00-16:45 From Feature Extraction to Measuring Dialect Typicality (Abstract)
Matthew Sung, Leiden University
16:45-17:30 Discussion
18:00 Conference dinner
Friday 21 June
9:30-10:00 Registration and Coffee
10:00-11:00 Automatic Detection of Syntactic Differences through the Minimum Description Length principle and feature mapping (Abstract)
Martin Kroon, Utrecht University
11:00-11:45 Drawing on Research on Explainability of Dialect Classifiers to Extract Greek Dialect Features (Abstract)
Erofili Psaltaki and Dana Roemling, University of Helsinki
11:45-12:30 Automatic Detection of Morphosyntactic Dialect Features in African American English Oral Histories (Abstract)
Kevin Tang, University of Düsseldorf