Sociocultural change and multi-word sequences: perspectives from the Swiss Text Corpus

Buerki, Andreas 2011. Sociocultural change and multi-word sequences: perspectives from the Swiss Text Corpus. Presented at: Language as a Social and Cultural Practice, Basel, 8-10 June 2011.

Multi-word sequences (MWSs), here defined as recurring word sequences displaying semantic unity, are now recognised as being far more central to language than was previously thought (Sinclair 1991; Altenberg 1998). Argued to be complex in form but simplex in function and in terms of processing (Wray 2002), MWSs appear to reflect habitual thought particularly closely (Wang 1991; Feilke 1994). Diachronically, it has been shown that significant frequency changes in MWSs are often linked to changes in the sociocultural environment (Buerki 2010). The present paper focuses on such links and suggests that the relationship between MWSs and sociocultural change goes beyond a simple reflection of topics and buzzwords in language. A more profound relationship, incorporating what might be termed systemic aspects of language, is suggested on the basis of corpus evidence. The data for this study were taken from the Swiss Text Corpus, a 20-million-word diachronic corpus of written standard German as used in Switzerland, spanning 1900 to 2000 (Bickel et al. 2009). The corpus was divided into five temporally ordered subcorpora covering two decades each. MWSs were extracted using UNIX shell scripts and the N-Gram Statistics Package (Banerjee & Pedersen 2003; Wilmsmann 2007). The resulting n-grams of 2 to 7 words in length were filtered to yield MWSs. Changes in frequency across the five time periods were identified, and those that could best be explained as motivated by sociocultural change were studied in more detail. They included linguistically less interesting phenomena such as changing frequencies of proper names (see also Bubenhofer 2009:209; Michel et al. 2011) as well as more interesting, less obviously socioculturally motivated changes. An example of the latter is the MWS die Forderung nach [etwas] (the demand for [sth]), which appears perfectly usual to speakers of German now, but only emerged as an MWS in the 1940s and 1950s and then peaked in frequency in the 1960s and 1970s.
In the paper, examples of this type are presented and possible implications for views of language and culture are discussed.

References:
Altenberg, B., 1998, On the phraseology of spoken English: The evidence of recurrent word-combinations, in A.P. Cowie (ed.), Phraseology: Theory, Analysis and Applications, Oxford: Clarendon Press.
Banerjee, S. & Pedersen, T., 2003, The Design, Implementation and Use of the Ngram Statistics Package, Proceedings of the 4th International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City.
Bickel, H., Gasser, M., Häcki Buhofer, A., Hofer, L. & Schön, Ch., 2009, Schweizer Text Korpus - Theoretische Grundlagen, Korpusdesign und Abfragemöglichkeiten, Linguistik Online, 39(3).
Bubenhofer, N., 2009, Sprachgebrauchsmuster: Korpuslinguistik als Methode der Diskurs- und Kulturanalyse, Berlin: Walter de Gruyter.
Buerki, A., 2010, All sorts of change: a preliminary typology of change in multi-word sequences in the Swiss Text Corpus, presentation at the FLaRN 2010 conference, 23-26 March 2010, University of Paderborn, Paderborn (D).
Feilke, H., 1994, Common sense-Kompetenz, Frankfurt am Main: Suhrkamp.
Michel, J.B., Shen, Y.K., Aiden, A.P., Veres, A., Gray, M.K., Pickett, J.P., Hoiberg, D., Clancy, D., Norvig, P. & Orwant, J., 2011, Quantitative Analysis of Culture Using Millions of Digitized Books, Science, 331(6014), pp. 176-82.
Sinclair, J., 1991, Corpus, Concordance, Collocation, Oxford: Oxford University Press.
Wang, W.S., 1991, Explorations in Language, Taipei: Pyramid Press.
Wilmsmann, B., 2007, Re-write of Text-NSP, [online].
Wray, A., 2002, Formulaic language and the lexicon, Cambridge: Cambridge University Press.
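
Illustrative sketch: the extraction and comparison steps described in the abstract (n-grams of 2 to 7 words, counted separately in each two-decade subcorpus, then compared across periods) might look roughly as follows. This is not the study's pipeline, which used UNIX shell scripts and the N-Gram Statistics Package; it is a minimal Python stand-in. The function names, the period labels, the whitespace tokenisation and the toy sentences are all assumptions made for illustration, and the filtering step that turns raw n-grams into MWSs is not reproduced.

```python
from collections import Counter


def ngrams(tokens, n_min=2, n_max=7):
    """Yield every contiguous word sequence of n_min to n_max tokens."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def per_million(subcorpora, n_min=2, n_max=7):
    """Map each period label to {n-gram: occurrences per million tokens}."""
    table = {}
    for period, text in subcorpora.items():
        # Naive lower-casing and whitespace split (assumption, not the
        # tokenisation used for the Swiss Text Corpus).
        tokens = text.lower().split()
        counts = Counter(ngrams(tokens, n_min, n_max))
        scale = 1_000_000 / max(len(tokens), 1)
        table[period] = {gram: c * scale for gram, c in counts.items()}
    return table


def trajectory(table, gram):
    """Normalised frequency of one n-gram per period, in insertion order."""
    return [(period, freqs.get(gram, 0.0)) for period, freqs in table.items()]


# Toy stand-ins for two of the five subcorpora (invented sentences,
# hypothetical period labels).
toy = {
    "1900-1919": "der mann ging in die stadt",
    "1960-1979": "die forderung nach lohn und die forderung nach ruhe",
}
table = per_million(toy)
traj = trajectory(table, ("die", "forderung", "nach"))
for period, freq in traj:
    print(period, round(freq, 1))
```

Normalising to occurrences per million tokens keeps periods of unequal size comparable; with real subcorpora, sequences whose trajectories rise or fall markedly would then be candidates for closer qualitative inspection.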

