Studying Variation in Syntax: A Parsed Corpus of Swiss German

Variability is a characteristic feature of natural language that can be found in various guises: variation between languages or between dialects of the same language; variation among speakers of the same dialect/language (inter-speaker variation); and variation within a single individual (intra-speaker variation). Whereas all these types of variation have been extensively studied in areas like phonology or the lexicon, work in syntax has considered almost exclusively the first type. Within formal approaches to syntax in particular, inter-speaker and intra-speaker variation have long been neglected. Only recently has the relevance of these phenomena for formal theories of syntax been recognized, and attempts have been made to reconcile theoretical syntax with variationist approaches of the sociolinguistic kind. The aim of this project is to contribute to these endeavours by focusing on syntactic variation in a dialect of Swiss German.
The main part of this project will be dedicated to creating an appropriate empirical basis for the investigation of syntactic variation. We propose to construct a one-million-word parsed corpus of naturally occurring speech that will allow easy data retrieval for syntactic analysis. Although our main research interest is in syntactic variation, the corpus will also be a valuable tool for researchers who would like to explore questions related to the phonetics, phonology, morphology, lexicon, semantics, pragmatics, syntax, or discourse structure of Swiss German. The last few years have seen the emergence of a large number of publicly available electronic corpora compiled for the purposes of linguistic analysis, yet to date no such corpus exists for Swiss German. A Swiss German corpus would therefore be a timely addition to the growing number of corpora available around the world.
Given that a corpus has to be relatively large to be suitable for syntactic analysis, we intend to concentrate on one variety of Swiss German. Within this variety, we will select informants across three generations and we will aim for gender balance in each of these groups. In the final part of the project, we will use the corpus to explore one area of intra-speaker variation found in Swiss German, the variation illustrated in (1).

(1) … dass de Peter luut gigele tuet / (luut) tuet (luut) gigele
     … that the Peter loudly giggle does / loudly does loudly giggle
     ‘… that Peter giggles loudly’

Speakers of Swiss German allow both the order ‘main verb-auxiliary’ and the order ‘auxiliary-main verb’ in (1). This variation, referred to as Verb (Projection) Raising, has been discussed extensively in the literature. However, no corpus studies have been performed so far to examine the way in which speakers use these options and the potential implications these usage data may have for the theoretical analysis of this optionality. The case study on the variation in (1) is only the starting point for a large number of studies on syntactic variation that can then be carried out on the basis of this parsed corpus of Swiss German. These studies are expected to provide a better understanding of the nature of synchronic variation in syntax. Furthermore, it is only with sociolinguistically balanced corpora of the type proposed here that deeper insights into the interaction between syntactic variation and syntactic change can be gained, as such corpora contain carefully selected data that historical corpora generally do not provide.
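As a rough illustration of the kind of retrieval such a corpus could support, the following minimal sketch counts the two orders from (1) over a handful of hand-tagged clauses. The tag names (`VAUX`, `VINF`, and the others), the flat token format, and the function names are assumptions made purely for illustration; they do not reflect the annotation scheme or query tools of the proposed corpus, and a real query would run over the parsed structures.

```python
from collections import Counter

# Hypothetical tag names; the project's actual annotation scheme is not specified.
AUX_TAG = "VAUX"   # auxiliary, e.g. 'tuet'
MAIN_TAG = "VINF"  # main verb, e.g. 'gigele'


def classify_order(clause):
    """Classify a clause given as a list of (word, tag) pairs.

    Returns 'main-aux' or 'aux-main', or None when the clause does not
    contain exactly one auxiliary and exactly one main verb.
    """
    aux = [i for i, (_, tag) in enumerate(clause) if tag == AUX_TAG]
    main = [i for i, (_, tag) in enumerate(clause) if tag == MAIN_TAG]
    if len(aux) != 1 or len(main) != 1:
        return None
    return "main-aux" if main[0] < aux[0] else "aux-main"


def count_orders(clauses):
    """Count how often each order occurs, skipping unclassifiable clauses."""
    orders = (classify_order(clause) for clause in clauses)
    return Counter(order for order in orders if order is not None)


# The three variants of example (1), tagged by hand for illustration.
clauses = [
    [("dass", "COMP"), ("de", "DET"), ("Peter", "N"),
     ("luut", "ADV"), ("gigele", "VINF"), ("tuet", "VAUX")],
    [("dass", "COMP"), ("de", "DET"), ("Peter", "N"),
     ("luut", "ADV"), ("tuet", "VAUX"), ("gigele", "VINF")],
    [("dass", "COMP"), ("de", "DET"), ("Peter", "N"),
     ("tuet", "VAUX"), ("luut", "ADV"), ("gigele", "VINF")],
]

counts = count_orders(clauses)
print(counts)
```

Grouping such counts by speaker, generation, or gender would then be a matter of adding the relevant metadata to each clause record, again assuming that the corpus makes this information available.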