A Comparative Grammar of British English Dialects




Topics in English Linguistics
50.1

Editors

Elizabeth Closs Traugott
Bernd Kortmann

Mouton de Gruyter
Berlin · New York


A Comparative Grammar
of British English Dialects
Agreement, Gender, Relative Clauses

by

Bernd Kortmann
Tanja Herrmann
Lukas Pietsch
Susanne Wagner

Mouton de Gruyter
Berlin · New York



Mouton de Gruyter (formerly Mouton, The Hague)
is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.

∞ Printed on acid-free paper which falls within the guidelines
of the ANSI to ensure permanence and durability.

Library of Congress Cataloging-in-Publication Data
A comparative grammar of British English dialects : agreement, gender, relative clauses / by Bernd Kortmann … [et al.].
p. cm. -- (Topics in English linguistics ; 50.1)
Includes bibliographical references and index.
Contents: The Freiburg English Dialect Project and corpus / Bernd
Kortmann, Susanne Wagner -- Relative clauses in English dialects of
the British Isles / Tanja Herrmann -- “Some do and some doesn’t” :
verbal concord variation in the north of the British Isles / Lukas
Pietsch -- Gender in English pronouns : southwest England / Susanne Wagner
ISBN 3-11-018299-8 (hardcover : alk. paper)
1. English language -- Dialects -- Great Britain. 2. English language -- Great Britain -- Grammar. 3. English language -- Relative
clauses. 4. English language -- Agreement. 5. English language --
Gender. I. Kortmann, Bernd, 1960- II. Series.
PE1721.C66 2005
427--dc22
2005001607

Bibliographic information published by Die Deutsche Bibliothek
Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data is available in the Internet at <>.

ISBN 3-11-018299-8

© Copyright 2005 by Walter de Gruyter GmbH & Co. KG, 10785 Berlin
All rights reserved, including those of translation into foreign languages. No part of this
book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
Cover design: Christopher Schneider, Berlin.
Printed in Germany.


Preface
Bernd Kortmann
Since the 1980s, but especially over the last ten years or so, the study of the
grammar of English dialects has been very much on the rise after more than
a century of neglect in English dialectology and in dialectology in general.
Witness, in particular, Trudgill and Chambers (1991), Milroy and Milroy
(1993), and, on a global scale, Kortmann and Schneider (2004). Apart from
these and several other publications related in spirit, however, the vast
majority of publications on the grammar of English dialects concentrates on
just one particular phenomenon in one particular dialect or dialect area, is
based on a very small database, and is purely descriptive. Moreover, the small
size of the available databases often makes it very difficult to formulate
valid descriptive generalizations. Virtually non-existent in English
dialectology are systematic comparative studies of individual grammatical
subsystems across a selection of dialects (like comparative studies of the
tense and aspect systems, pronominal systems, relativization or
complementation patterns, etc.). Exceptions in this respect are the
sociolinguistic studies by Tagliamonte and her research team (e.g.
Tagliamonte 1999, 2002, 2003), and the contributions, especially the
regional and global synopses, in Kortmann and Schneider (2004).
However, useful as the synopses are in providing general orientation, they
can be no more than starting-points for systematic comparative
analyses of individual phenomena of dialect grammar.

The present volume, the first in a series of volumes which will be
published at irregular intervals, tries, first of all, to set an example of how this gap in
English dialectology can be filled. Secondly, it will do away with another
problem that has beset the study of English dialect syntax for many
decades, namely the lack of a sufficient amount of reliable data. The Survey
of English Dialects, for example, compiled in the 1950s and serving as the
most important data source for English dialectologists and dialect
geographers ever since, was simply not geared to the systematic collection
of data on grammar. Just a fraction of the more than 1300 questions in the
SED questionnaire was explicitly designed to collect morphological and
syntactic information. Only since the late 1980s have efforts been made at
compiling large data collections, such as the Survey of British Dialect
Grammar (Cheshire, Edwards and Whittle 1989), the Newcastle Electronic
Corpus of Tyneside English (NECTE; see Allen et al., forthcoming), and,
largest of all, the computerized Freiburg English Dialect Corpus. It is the
latter, FRED for short, which will take centre stage in this volume. Thirdly,
all studies in this volume are informed by a typological approach to English
dialect grammar (apart from the fact that two of the dialect phenomena
investigated here, namely the Northern Subject Rule and pronominal
gender, are typologically very rare). This approach is the hallmark of the
Freiburg research group on English dialect syntax, initiated and coordinated
by Bernd Kortmann, and will be outlined in the scene-setting paper by
Kortmann and Wagner. It is in this paper, too, that the nature and design of
FRED, and its advantages for both qualitative and quantitative analyses of
dialect phenomena will be discussed in some detail.
The subject matter of the three studies forming the backbone of this
volume can briefly be characterized as follows: Tanja Herrmann examines
(adnominal) relative clauses in six dialect areas of the British Isles (Central
Midlands, Central North, Central Southwest, East Anglia, Northern Ireland,
Scotland). The results of this cross-dialectal study she relates to typological
hierarchies, particularly to Keenan and Comrie’s (1977) Noun Phrase
Accessibility Hierarchy. The Accessibility Hierarchy is largely verified for
all relative clause formation strategies found in the data, including the zero
relative marker strategy (as in The man ___ called me was our neighbour).
From a diachronic perspective, the Accessibility Hierarchy also helps to
reveal the pattern underlying the way individual relative markers (e.g. the
relative particles as and what) enter or exit an existent relative marker
system.
Lukas Pietsch investigates, synchronically as well as diachronically, the
so-called Northern Subject Rule (NSR), a feature found in the Northern
dialects of England, but also in Scotland, Northern Ireland and the Republic
of Ireland. This rule is concerned with subject-verb agreement, and can
roughly be formulated as follows: every verb in the present tense can take
an s-ending unless its subject is an immediately adjacent simple pronoun.
(Third person singular verbs always take the s-ending, as in Standard
English.) In other words, the NSR involves a type-of-subject constraint
(pronoun vs. common/proper noun) and a position constraint (+/- immediate
adjacency of pronominal subject to verb). Thus, in NSR-varieties we get the
following examples: I sing vs. *I sings, Birds sings, and I sing and dances.
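
The rough formulation of the rule given above can be restated as a small decision function. The following Python fragment is only an illustrative sketch: the function name and its boolean parameters are assumptions introduced here, and the rule is treated as categorical, whereas the text says merely that the verb "can" take the s-ending.

```python
def s_ending_allowed(subject_is_simple_pronoun, adjacent_to_verb,
                     third_person_singular=False):
    """Return True if a present-tense verb may take the s-ending under the
    rough formulation of the Northern Subject Rule (hypothetical helper)."""
    # Third person singular verbs always take the s-ending,
    # as in Standard English.
    if third_person_singular:
        return True
    # Otherwise the s-ending is blocked only when the subject is a simple
    # pronoun that stands immediately next to the verb.
    return not (subject_is_simple_pronoun and adjacent_to_verb)


# "I sing" vs. "*I sings": adjacent pronoun subject, so no s-ending.
print(s_ending_allowed(True, True))
# "Birds sings": the subject is a noun, so the s-ending is possible.
print(s_ending_allowed(False, True))
# "I sing and dances": "dances" is not adjacent to its pronoun subject.
print(s_ending_allowed(True, False))
```

The two arguments correspond to the type-of-subject constraint and the position constraint respectively; real NSR data are variable, so a function of this kind can only describe where the s-ending is licensed, not how often it occurs.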


Susanne Wagner, finally, provides a comprehensive account of a special
semantic system of (pronominal) gender marking, which is distinctive of
the traditional dialects in Southwest England. What we encounter in
Somerset, in particular, is pronominal gender that is primarily sensitive to
the mass/count distinction and only secondarily to the animate/inanimate
and human/nonhuman distinctions. The pronoun it is used only for mass nouns. Count
nouns take either he or she: she is used if the count noun refers to a female
human, and he is used for count nouns either referring to male humans or to
nonhuman entities. Thus we get a contrast as in Pass the bread – it’s over
there. (bread = mass noun) and Pass the loaf – he’s over there. (loaf =
count noun). Most gendered pronouns are masculine pronominal forms (he,
him and Southwestern un, en < OE hine) referring to inanimate referents.
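
The assignment rules just described can be summarized in a few lines of Python. This is a simplified sketch of the system as it is characterized in this Preface only; the function name and parameters are assumptions made for illustration, and only the subject forms it, he and she are returned.

```python
def somerset_pronoun(is_count_noun, is_human=False, is_female=False):
    """Pick the pronoun for a referent in the traditional Somerset system
    as summarized above (hypothetical helper, subject forms only)."""
    # The mass/count distinction comes first: mass nouns take "it".
    if not is_count_noun:
        return "it"
    # Among count nouns, only female humans take "she".
    if is_human and is_female:
        return "she"
    # Male humans and all nonhuman count nouns take "he".
    return "he"


print(somerset_pronoun(is_count_noun=False))  # bread (mass noun)
print(somerset_pronoun(is_count_noun=True))   # loaf (count noun)
```

The order of the checks mirrors the claim that the system is primarily sensitive to the mass/count distinction and only secondarily to the human/nonhuman distinction.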

Acknowledgments
All authors most gratefully acknowledge the generous support by the
Deutsche Forschungsgemeinschaft. Without the funding of the Projects KO
1181/1-1,2,3 over a five-year period (2000-2005) the studies published here
and the compilation of FRED, the Freiburg English Dialect Corpus, would
have been impossible.

References
Allen, Will, Joan Beal, Karen Corrigan, Warren Maguire, and Hermann Moisl
forthc. Taming Unconventional Digital Voices: The Newcastle Electronic
      Corpus of Tyneside English. In Using Unconventional Digital
      Language Corpora. Vol. I: Synchronic Corpora, Joan Beal, Karen
      Corrigan and Hermann Moisl (eds.). Basingstoke: Palgrave
      Macmillan.
Cheshire, Jenny, Viv Edwards, and Pamela Whittle
1989  Urban British dialect grammar: The question of dialect levelling.
      English World-Wide 10: 185–225.
Kortmann, Bernd, Edgar W. Schneider in collaboration with Kate Burridge,
Rajend Mesthrie, and Clive Upton (eds.)
2004  A Handbook of Varieties of English, Vol. 2: Morphology and Syntax.
      Berlin/New York: Mouton de Gruyter.
Milroy, John, and Lesley Milroy (eds.)
1993  Real English. The Grammar of English Dialects in the British Isles.
      London/New York: Longman.
Tagliamonte, Sali
1999  Was/were variation across the generations: View from the city of
      York. Language Variation and Change 10: 153–191.
2002  Variation and change in the British relative marker system. In
      Relativisation in the North Sea Littoral, Patricia Poussa (ed.), 147–
      165. Munich: LINCOM EUROPA.
2003  ‘Every place has a different toll’: Determinants of grammatical
      variation in cross-variety perspective. In Determinants of Linguistic
      Variation, Günter Rohdenburg, and Britta Mondorf (eds.), 531–554.
      Berlin/New York: Mouton de Gruyter.
Trudgill, Peter, and Jack K. Chambers (eds.)
1991  Dialects of English. Studies in Grammatical Variation. London/New
      York: Longman.



Table of Contents
Preface
Bernd Kortmann

The Freiburg English Dialect Project and Corpus (FRED)
Bernd Kortmann and Susanne Wagner
1. Comparative dialect grammar from a typological perspective
2. The Freiburg English Dialect Corpus (FRED)
3. Linguistic consequences of using oral history material

Relative clauses in English dialects of the British Isles
Tanja Herrmann
Abstract
1. Introduction
2. Data
3. Overall distribution of relative clauses and relative markers
4. Previous investigations of relative markers
5. Restrictiveness/nonrestrictiveness
6. Personality/nonpersonality
7. Preposition placement
8. Accessibility Hierarchy
9. Resumptive pronouns
10. Which as ‘connector’?
11. Conclusion
Appendix 1
Appendix 2

“Some do and some doesn’t”: Verbal concord variation in the north
of the British Isles
Lukas Pietsch
Abstract
1. Introduction
2. The Northern Subject Rule: Descriptive problems
3. Data from twentieth-century northern dialects
4. The history of the Northern Subject Rule
5. Theoretical accounts of the Northern Subject Rule
6. Discussion: Variation and usage-based theories

Gender in English pronouns: Southwest England
Susanne Wagner
1. Introduction
2. Gendered pronouns
3. Gender in English and elsewhere
4. The corpora
5. Special referent classes
6. Non-dialectal studies of gender assignment
7. Persistence of gendered pronouns
8. SED – Basic Material
9. The SED fieldworker notebooks data
10. Southwest England oral history material
11. Material from Newfoundland
12. Overall summary
Appendix
(Additional) corpus material

Index


The Freiburg English Dialect Project and Corpus
(FRED)
Bernd Kortmann and Susanne Wagner

1. Comparative dialect grammar from a typological perspective
The Freiburg project started in the late 1990s and has received major
funding from the Deutsche Forschungsgemeinschaft from spring 2000 until
spring 2005.¹ Its basic approach to the study of dialect grammar is informed
by the theoretical and methodological framework of functional (or:
Greenbergian) typology, which is primarily concerned with the patterns and
limits of morphological and syntactic variation across the languages of the
world. The basic idea of the Freiburg project is to adopt functional
typology as an additional reference frame for dialectological research that
fruitfully complements existing approaches. Among other things, this
means that, in a first step, we determine the cross-dialectal variation
observable in individual domains of grammar (in the present volume:
negation, relative clauses, pronominal and agreement systems) before, in a
second step, judging it against the cross-linguistic variation described in
typological studies. Both dialect syntacticians and typologists are bound to
profit from this kind of approach (cf. Kortmann, ed. 2004 for a collection of
studies on dialects in Europe conducted in this spirit). On the one hand,
dialectologists can draw upon a large body of typological insights into,
hypotheses on, and explanations for language variation. Dialect data can
thus be looked at in a fresh light and new questions be asked. On the other
hand, typologists will get a broader and, most likely, more adequate picture
of what a given language is like if they no longer ignore dialectal variation.
In fact, non-standard dialects (as varieties which are almost exclusively
spoken) are bound to be a crucial corrective for typological research, which
is typically (especially for languages with a literary tradition) concerned
with the written standard varieties of languages. Standard British English,
for example, is anything but representative of the vast majority of English
dialects if we think, for instance, of the absence of multiple negation or the
strict division of labour between the Present Perfect and the Simple Past.


In the present volume, the typological perspective is most prominent in
Herrmann’s study on relativization (not surprisingly so, given the fame of
Keenan and Comrie’s NP Accessibility Hierarchy in language typology).
(1)  Accessibility Hierarchy (AH)
     subject > direct object > indirect object > oblique > genitive >
     object of comparison

According to the AH, if a language can relativize any NP position further
down on the hierarchy, it can also relativize all positions higher up, i.e. to
the left of it. This constraint applies to whatever relativization strategy a
language employs. For the relativization strategy known as zero-relativization (or: gapping) there is thus a clear prediction that the
relativized NP is most likely to be gapped if it is the subject of the relative
clause, next most likely if it is the direct object of the relative clause, etc.

However, this is clearly not the case for Standard English: the direct object
position can be gapped (2a), whereas the subject position cannot (2b):
(2)  a. The man I called _____ was our neighbour. (direct object)
     b. *The man _____ called me was our neighbour. (subject)

English dialects, on the other hand, conform to the AH prediction.
Examples like (2b) are nothing unusual at all; in fact, gapping of the
subject position is an extremely widespread phenomenon in non-standard
varieties of English in and outside the British Isles:
(3)  a. I have a friend ____ lives over there.
     b. It ain’t the best ones ____ finish first.

So here we have a striking instance of the situation where the non-standard
varieties of English conform to a typological hierarchy whereas the
standard variety does not. For further ways in which the AH is relevant
when looking at other relativization strategies used in English dialects
compare Herrmann’s comparative study of relative clauses in six English
dialect areas in this volume.
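
The implicational logic of the AH can be made concrete with a short check: a relativization strategy conforms to the hierarchy if the positions it can relativize form an unbroken stretch starting at the subject end. The Python sketch below rests on several assumptions: the function name is invented for illustration, the constraint is treated as categorical (whereas the prediction for gapping above is phrased in terms of likelihood), and the two example sets record only the subject and direct object positions discussed in (2) and (3), not a full description of either variety.

```python
# Keenan and Comrie's Accessibility Hierarchy, most accessible position first.
ACCESSIBILITY_HIERARCHY = [
    "subject", "direct object", "indirect object",
    "oblique", "genitive", "object of comparison",
]


def conforms_to_ah(relativizable):
    """Return True if the given NP positions form an unbroken stretch
    from the left (subject) end of the hierarchy."""
    unknown = set(relativizable) - set(ACCESSIBILITY_HIERARCHY)
    if unknown:
        raise ValueError(f"unknown NP positions: {sorted(unknown)}")
    seen_gap = False
    for position in ACCESSIBILITY_HIERARCHY:
        if position not in relativizable:
            seen_gap = True      # a position that cannot be relativized
        elif seen_gap:
            return False         # a lower position is available, a higher one is not
    return True


# Illustrative sets for the zero (gapping) strategy only.
standard_english_gapping = {"direct object"}         # (2a) yes, (2b) no
dialect_gapping = {"subject", "direct object"}       # (3a), (3b) are fine

print(conforms_to_ah(standard_english_gapping))
print(conforms_to_ah(dialect_gapping))
```

On these deliberately minimal inputs the standard-variety set fails the check because the subject position is skipped, while the dialect set passes.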

Another rewarding area for anybody investigating dialects from a
typological perspective is the study of negation markers and strategies. This
subsystem of English dialect grammar has been investigated in depth by
Lieselotte Anderwald, a senior member of the Freiburg research team (cf.
especially Anderwald 2002a, 2003). For one thing, multiple negation (or:
negative concord) is another striking proof of the typological “well-behavedness” of non-standard varieties of English (and other Germanic
languages), since multiple negation is the rule for many standard languages
in Europe. Only the standard varieties of Germanic (e.g. Standard English,
Standard Dutch, Standard German) are the exceptions. Furthermore, the
invariant supraregional negation markers don’t (i.e. also for he/she/it don’t)
and ain’t are in full accordance with the powerful typological concept of
markedness: as Greenberg found for many languages, morphological
distinctions tend to be reduced under negation. As Anderwald (2002b,
2004) has also nicely shown, the alleged amn’t gap (Hudson 2000) in
almost all varieties of English (*I amn’t vs. I am not, aren’t I) is anything
but a gap and can indeed be considered an extreme case of local (or:
reversed) markedness. Whereas for all other auxiliary verbs negative contraction
(e.g. haven’t, hasn’t, won’t) is vastly preferred over auxiliary contraction
(e.g. ’ve not, ’d not, ’ll not), we get the reverse picture for be. Even isn’t
(12.5%) and aren’t (3.5%) are used very rarely in the British Isles, so that
the near absence of amn’t in standard as well as non-standard varieties is
not a striking exception, but simply the tip of the iceberg.
The motivation for this striking preference for be-contraction, as against
the negative contraction preferred for all other auxiliaries, is most likely a
cognitive one, namely the extremely low semantic content of be. This leads on to another
typological concept which can be usefully applied to the interpretation of
dialect facts: iconicity. In the case of be-contraction we find an extreme
formal reduction of a semantically near-empty auxiliary. In other words,
the amount of coding material matches the semantic content to be coded.
Another case in point is the fact that quite a number of non-standard
varieties in the British Isles and, in fact, around the world have made new
use of the number distinction for Past Tense be, i.e. the was-were
distinction (cf. Anderwald 2002a). These varieties use was for all persons
in the singular and plural in affirmative sentences, while using weren’t for
all persons in singular and plural in negative sentences, thus
remorphologizing the number distinction of Standard English as a polarity
distinction. What we have here is a showcase example of iconicity: a
maximal difference in form (was vs. weren’t) codes a maximal semantic
and cognitive difference (affirmation vs. negation). The relevant non-standard varieties of English have clearly developed a more iconic polarity
pattern than Standard English has.
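
The contrast between the two systems can be set side by side in a brief Python sketch. Both function names and the encoding of person and number are assumptions made for illustration; the second function idealizes the remorphologized pattern as categorical, which actual dialect data need not be.

```python
def past_be_standard(person, number, negative=False):
    """Standard English past tense of be: the choice of stem depends on
    person and number (was for 1st/3rd singular, were elsewhere)."""
    stem = "was" if number == "sg" and person in (1, 3) else "were"
    return stem + "n't" if negative else stem


def past_be_remorphologized(person, number, negative=False):
    """Idealized non-standard pattern described above: person and number
    are ignored, and the form depends on polarity alone."""
    return "weren't" if negative else "was"


for person, number in [(1, "sg"), (2, "sg"), (3, "sg"), (1, "pl")]:
    print(person, number,
          past_be_standard(person, number),
          past_be_standard(person, number, negative=True),
          past_be_remorphologized(person, number),
          past_be_remorphologized(person, number, negative=True))
```

That the second function never consults its person and number arguments is the point of the comparison: the was/were contrast has been reassigned from number to polarity.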


Having outlined and illustrated the typological approach to comparative
dialect grammar established by the Freiburg research group, we need to
mention at least briefly that there is of course also a generativist perspective
from which especially the dialect grammars of Italian, Dutch and German
(much less so English dialects) have been investigated. As a matter of fact,
generativists discovered the significance of dialect syntax for linguistic
theorizing and models of syntax much earlier than typologists. This
generativist interest in microparametric syntax began with the advent of the
Principles and Parameters approach in the 1980s and has steadily increased
ever since (cf., for example, Black and Motapanyane 1996 or various
contributions to Barbiers, Cornips and van der Kleij 2002), finding its way
even into generative theories based on Optimality Theory (cf., for example,
the Stochastic OT account of morphosyntactic variation in English by
Bresnan and Deo 2001). In this volume, the generativist perspective will
come in only where relevant publications on the dialect phenomena
investigated here exist.

2. The Freiburg English Dialect Corpus (FRED)
Given the aims of the Freiburg project it was first of all necessary to
compile a database which would allow us to conduct serious qualitative and
quantitative morphosyntactic research across English dialects. The result is
the computerized Freiburg English Dialect Corpus (FRED), which has
been compiled over a period of roughly five years (including the
digitization of some 120 hours of audio material). FRED consists of
approximately 2.5 million words, with representative subsamples for all
English dialect areas including data from Scotland and Wales. The data in
FRED are orthographically transcribed interviews collected for the most
part during the 1970s and 1980s in the course of oral history projects all
over the British Isles. The majority of the informants were born between
1890 and 1920, i.e. are roughly a generation younger than the generation of
informants who were recruited for the Survey of English Dialects (SED).

2.1. Principles of compilation
Firstly, the corpus was designed to permit the investigation of phenomena
of non-standard morphosyntax (rather than analyses of phonetic or
phonological details). Features of syntax are – almost by definition – much
rarer than features of phonetics and phonology and very large quantities of
text are therefore necessary. (According to some estimates, about 40 times
the amount of text is needed for a syntactic analysis as opposed to a
phonetic one.) This considerably restricted the practicality of collecting our
own corpus from scratch. Instead, we decided to try to compile a corpus
from materials that were already available. We decided against collecting
material with the help of questionnaires in the first phase of the project.
Questionnaires were however designed and distributed in the second phase
of the project when, on the basis of extensive corpus analyses, interesting,
transitional or rare phenomena became apparent that could not be further
investigated with the help of FRED alone.
Secondly, we decided to collect material that would best be classified as
traditional dialect data. This means that we explicitly tried to find material
from speakers who grew up before the Second World War, as the war
seems to have been the major cataclysmic event after which wide-ranging social
and economic changes (with concomitant linguistic changes) came into
effect. For example, highly increased mobility after WWII led to dialect
levelling on a hitherto unknown scale (see for example Williams and
Kerswill 1999: 149); mass affluence resulted, amongst other things, in
television sets becoming easily available and spreading at least passive
knowledge of the standard language; increased public spending made sure
that education changed not only qualitatively but also quantitatively, such
that children leaving school at age 11 or 12 – not unusual for lower class
children only 60 or 70 years ago – is no longer possible, and so on. Only by
concentrating on speakers born before WWII could we at least have a
chance that our data would still be “dialectal” in a regional sense, and be
comparable to older dialect descriptions and dialect data (on the
background of speaker selection for the SED, see Orton 1962: 14). There
are a number of other arguments and a priori considerations which
contributed to this decision: We had established contact with various
researchers, research groups and private individuals who were either in
possession of similar materials or were already working with such data, and
who had kindly offered us access to them. Moreover, the only existing
sources on variation in morphosyntax are based on traditional material,
most importantly the SED (Orton et al. 1962–1971). To guarantee
comparability between these materials, it was essential that FRED should
also consist of traditional dialect material without having to take into
account factors like mobility or the influence of mass media.


Due also to time constraints, but mainly for the reasons detailed above,
it was considered impossible from the outset to record, digitise and
transcribe all data that should make up FRED ourselves. Based on our
research objectives, we were looking for large quantities of traditional
regional speech, preferably by older local speakers with strong family
affiliations in the area, that would record the use of speakers who grew up
before WWII, or even better, before WWI. This meant that we were
looking for material preferably from the 1970s and 1980s, recording older
speakers, or from the 1990s, if these recorded very old speakers. Our
material had to be recorded in acceptable quality for linguistic analysis,
ideally even including transcripts that were reliable on a word-by-word
basis, and – most important of all – the material had to be more or less
freely available to us as researchers who had not originally been part of the
research design. These criteria suggested a new source that has so far not –
or hardly – been used for dialectological purposes, namely tape recordings
and transcripts from oral history projects.

2.2. The role of oral history
As defined by the Oral History Society, “[o]ral history is the recording of
people’s memories. It is the living history of everyone’s unique life
experiences” (Oral History Society at ). Oral
history collections sometimes originate from projects (short- or long-term)
undertaken by an individual (sometimes also a group of individuals or an
institution), typically lay persons, not professional historians, with an
interest in a specific theme or topic, often just recording life memories.
Such a focus has certain implications concerning the content² and general
circumstances of an interview. Interviewees are generally pensioners in
their 60s or older, and only rarely do we find projects that have as many
female as male speakers.³
The recording situation makes oral history material ideal for linguistic
investigation. The interviewers were usually true insiders, coming from the
area, often still speaking the dialect themselves, which tends to relax the
interview situation considerably. A second advantage is that the speaker’s
attention was genuinely on what was being said, rather than on how it was
being said. Fortunately, the Oral History Society advises all potential
interviewers to give a copy of the tapes to their local library or archive,⁴
and these are the places where oral history material can be found today
across Great Britain.⁵

2.3. From original recording to text in FRED – the steps and processes
Members of the Oral History Society are advised to at least “[w]rite a
synopsis of the interview which briefly lists in order all the main themes,
topics and stories discussed”;⁶ verbatim transcripts are not explicitly
mentioned. But, of course, for anyone thinking about long-term work with
the material, a transcript is a very good way of allowing people from
outside to get an impression of the content of the interview without actually
having to listen to the tapes, which is a very time-consuming business. The
intentions for the future use of transcripts largely determine how the
interview was transcribed, “how” here referring particularly to the
(unfortunately very common) practice of “normalizing” the speakers’
language. Since oral history projects as a rule do not involve the
employment of a professional transcriber, this is the usual course of events,
which is of course perfectly justified for oral history purposes. Just to give
one example consisting of several actually occurring utterances, consider
(4) which could end up as (5):
(4)  That pot? Oh, I, I don’t know, I don’t remember what I made he for.
     I don’t collect no pots now.
(5)  I don’t remember what I made that pot for. I don’t collect pots now.

“Normalization” here has eradicated three morphosyntactic dialect features
(he = pot; he here used in an oblique context; double negation don’t ... no),
not to mention all the “superfluous data” (repetitions and so on) that are
simply left out. This kind of standardized re-written text is of course much
more useful to the general public than a transcript that uses so many
instances of “eye-dialect” to represent non-standard pronunciation as well
as dialect that it is difficult to follow the line of argument.
Despite the obvious linguistic drawbacks, the Freiburg research group
was very glad to have transcripts of at least some of its material. Although
these were highly deficient from a dialectological point of view, they at
least solved such difficult problems for us as deciphering correctly some
specialist vocabulary, unusual place names, personal names and so on, such
as the names of different apple sorts used for cider making. We then
carefully compared existing transcripts with the original tapes and
reinserted all morphological, syntactic and discourse features, taking out
irrelevant phonetic or phonological features and features of pure eye-dialect
(compare 6 with 6'):
(6)   And the farmer wot my gran used ter ee used ter have a white high
      healed collar.
(6')  #And the farmer what my gran used to, he used to have a white high
      healed collar. (FRED Wil_011)

For the rest of the material where no transcripts were available, we
transcribed the original tapes ourselves, mostly with the help of native
speakers who either worked on the project or were associated with it in
related research projects. In addition, all transcripts were carefully checked
by dialectologically trained research assistants.
As a result, the actual transcripts used for FRED are verbatim
equivalents of the spoken versions: hesitations, repetitions, false starts of
the same sentence and so on are all included.
(7)   #Oh well now, I tell you, when I first made my will, Mr (gap ‘name’)
      my lawyer, (unclear) oh yes (/unclear) he’s still alive, (trunc)
      I(/trunc) I told him I I I says, I want to leave the Salvation Army
      a bit of money, and I have done. (FRED Nott_016)

In addition, as stated above, and most importantly for our research
purposes, all morphosyntactic dialect features have been reinserted
(indicated in bold print below).
(8)   … there used to be a Ginnet what we used to call was a Ginnet, he
      were a nice eating apple, a nice sweet apple and a good apple for
      cider. #When them apples were ripe you could pick them up and
      could press them like that and you’d see your thumb mark in them or
      any apple really when he’s ripe, wadn’t it, but when he’s not ripe
      he’s hard, isn’t he, (unclear) you ain’t gonna, (/unclear) well,
      anything at all. (FRED Som_001)

Among the features likely to have been “corrected” in the original
transcript are, as in (8) for example, a what-relative, demonstrative them
and “gendered” pronouns.



A variety of phonological features were also kept, either if they were
already represented in the original transcripts, or if we suspected that they
might interact with morphosyntax, for example contracted forms like
wanna, gonna, s’pose and so on. It should be noted that we use the
semi-phonetic form mi for /mi:/ used as the possessive pronoun not as
“eye-dialect” but in order to facilitate searches. (For severe criticism of
gratuitous eye-dialect, see for example Preston 1985, 2000.) The
orthographic form me, although widespread in other corpora, not only
suggests a certain etymology for this form (at worst, a “substitution” of the
object form of the personal pronoun for the possessive function), but also
complicates computer-based searches considerably, as all instances of the
object case of the personal pronoun (He saw me) have to be manually
excluded, at least as long as the corpus is not tagged for word class yet.
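The search problem described here can be illustrated with a minimal sketch. The sentence below is invented for illustration and is not taken from FRED; the second version simply respells the possessive as me to show what a search would then return.

```python
import re

# Invented example sentence (not corpus data), using the spelling "mi"
# for the possessive pronoun.
mi_version = "He saw me in mi garden and gave me mi hat back."

# The same sentence if the possessive were spelled "me" instead.
me_version = re.sub(r"\bmi\b", "me", mi_version)

# With "mi", a whole-word search returns only the possessive tokens.
possessive_hits = re.findall(r"\bmi\b", mi_version)

# With "me", the same kind of search mixes possessives with object
# pronouns, which would have to be separated by hand.
mixed_hits = re.findall(r"\bme\b", me_version)

print(len(possessive_hits), "possessive tokens found via 'mi'")
print(len(mixed_hits), "tokens found via 'me', only some of them possessive")
```

In an untagged corpus, the distinct spelling therefore does the work that a word-class tag would otherwise have to do.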
We also represent certain paralinguistic features like laughter, long
pauses, or indistinct stretches of conversation (marked as gaps, unclear
passages or truncated words; see also the examples above). All these
features are indicated in the transcripts by specific tags to minimize the risk
of ambiguities.7 This opens up the possibility for analyses on a pragmatic or
discourse level. In this way we have tried to remedy the linguistic
shortcomings of the original oral history material as far as possible.
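As a rough illustration of how such tags can be processed automatically, the following sketch extracts the annotations from a line adapted from example (7). The tag inventory (gap, unclear, trunc) is assumed from that example alone, the printed curly quotes are replaced by straight ones, and the function names are hypothetical.

```python
import re

# Line adapted from example (7); straight quotes replace the printed ones.
line = ("Mr (gap 'name') my lawyer, (unclear) oh yes (/unclear) he's still "
        "alive, (trunc) I(/trunc) I told him")

# Paired tags of the form (tag) ... (/tag); the inventory is an assumption.
PAIRED = re.compile(r"\((unclear|trunc)\)\s*(.*?)\s*\(/\1\)")
# Stand-alone gap tag with a quoted description.
GAP = re.compile(r"\(gap '([^']*)'\)")

def annotations(text):
    """Return (tag, content) pairs in order of appearance."""
    found = [(m.start(), m.group(1), m.group(2)) for m in PAIRED.finditer(text)]
    found += [(m.start(), "gap", m.group(1)) for m in GAP.finditer(text)]
    return [(tag, content) for _, tag, content in sorted(found)]

def strip_markup(text):
    """Remove gap tags and tag brackets, keeping the enclosed words."""
    text = GAP.sub("", text)
    text = PAIRED.sub(lambda m: m.group(2), text)
    return re.sub(r"\s+", " ", text).strip()

print(annotations(line))
print(strip_markup(line))
```

Keeping the markup separable from the words in this way is what allows the same transcript to serve both word-level searches and discourse-level analyses.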
As mentioned above, extralinguistic variables in FRED are constrained
by intention – FRED is not designed to be a representative sociolinguistic
corpus, but a regionally representative corpus of as broad a dialect speech
as possible. As has already been pointed out, our oral history projects
concentrated on interviewing older people. These older people are typically
very local, that is they still live in the place where they were born, without
having moved outside the region for any considerable stretch of time. Also,
typical FRED speakers usually left school about the age of fourteen, often
much earlier, certainly not progressing to higher education. Finally, most of
our speakers are male – as is well known, women tend to use more
prestigious, in many cases more standard forms of speech where these are
available to them (see for example Chambers and Trudgill 1998²: 30). In
other words, most of our speakers would qualify in dialectology as typical
NORMs (see Chambers and Trudgill 1998²: 29), that is non-mobile old
rural male speakers with little education. Although this restricts the range
of investigations that can be conducted with the help of FRED in
sociolinguistic terms, it represents exactly the same bias as in earlier
dialectological work, where we find a preponderance of NORM speakers as
well, so that results from work on FRED will be comparable to earlier
studies or to material from earlier investigations.

2.4. Advantages and disadvantages of orthographic transcripts
FRED is transcribed orthographically, as are most computerized corpora. A
number of factors – besides a simple realistic evaluation of our resources –
had made it clear from the beginning of the project that orthographically
transcribing the dialect material would be the only viable (short-term)
procedure. First, we had been granted only a restricted amount of time (and
funding) to complete the compilation and transcription of the corpus, and
there has to be a natural trade-off between the detail of a transcription
(depth) and the coverage (breadth). Our aim was a large corpus that would
cover a number of dialect regions, and so we had to trade in phonetic detail.
Moreover, since our explicit focus was on morphosyntactic variability, for
all relevant features of dialect grammar that we expected to investigate and
that are discussed in the dialectological literature, a phonetic or phonemic
transcription would not only have been unnecessary, but even
counterproductive in many cases. For example, one major drawback of a
non-orthographic transcript concerns comparability. A non-orthographic
transcript would dramatically hamper the feasibility of searching for all
tokens of a certain type (for instance be, personal pronouns, and so on), as
the researcher would have no clue which forms to look for without
knowing which realisations actually occur in a given interview (or even
across all interviews). As a result, one would have to return to the
procedure that was common in corpus linguistics before the advent of
computers: reading through the texts and marking all forms of interest in
the process – certainly not an ideal situation. Finally, only an orthographic
transcription of the data met the other requirements of our corpus: the
finished corpus was intended to be machine-readable, enabling easy access,
a variety of searches with various tools, and, most importantly,
comparability with other corpora/projects.
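The searchability argument can be made concrete with a small sketch. Because the transcript is orthographic, the forms of be can be listed in advance and counted directly; the sample is taken from example (8), and the list of forms is an assumption that deliberately ignores contracted forms such as he’s.

```python
import re
from collections import Counter

# Closed list of full forms of "be"; contracted forms are not covered.
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}

# Stretch of example (8), in orthographic transcription.
sample = ("he were a nice eating apple, a nice sweet apple and a good apple "
          "for cider. When them apples were ripe you could pick them up")

# Simple tokenization on letters and apostrophes.
tokens = re.findall(r"[a-z']+", sample.lower())

# Count every token that belongs to the predefined list of forms.
be_counts = Counter(t for t in tokens if t in BE_FORMS)

for form, n in be_counts.items():
    print(form, n)
```

With a phonetic or phonemic transcript no such list could be drawn up beforehand, since the realisations occurring in a given interview would not be known.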
As has been mentioned above, research ties between the Freiburg team
and similar projects had been established. Since most of these projects were
working with orthographic transcripts, this lent additional support to the
decision to use orthographic transcription for FRED. Moreover,
orthographic transcription would allow us to compare our data with older
collections and enable us to make comparisons between different speakers,
different dialects, different dialect areas, and different corpora. A further
advantage of orthographic transcription is the concentration on real
(morphosyntactic) dialect features, as phonetically exceptional forms do not
distract the analyst’s eye from the task at hand.
All transcription conventions have of course been documented. Thus in
many cases phonetic peculiarities may be traced from the transcription and
the additional databases alone without having to return to the sound files.
An alignment of sound and transcripts is planned for the near future.

2.5. FRED – corpus design and area coverage

2.5.1. Word counts and areal distribution
FRED consists of 370 texts, which total roughly 2.5 million words of text or
about 300 hours of speech, excluding all interviewer utterances (see Table 1).
The FRED material is broadly subdivided to cover nine major dialect areas,
following Trudgill’s “modern dialects” division of Great Britain (see Trudgill
1999²: 65).
Table 1.  FRED word counts and areal distribution

dialect area               size (in thousands of words)    % of total
Southwest (SW)                         571                      23
Southeast (SE)                         643                      26
Midlands (Mid)                         359                      15
North (N)                              434                      18
Scottish Lowlands (ScL)                169                       7
Scottish Highlands (ScH)                23                       1
Hebrides (Heb)                         151                       6
Isle of Man (Man)                       10                       1
Wales (Wal)                             89                       4
total                                2,449                     100%



Each dialect area is subdivided into different counties. A detailed
breakdown of counties can be found on the project website.

2.5.2. Speakers
FRED contains data from 420 different speakers (excluding interviewers):
268 (63.8 per cent) are male, and 127 (30.2 per cent) are female (gender is
unknown for the rest). In all, 77.2 per cent of the textual material in FRED
is produced by male speakers, and 21.4 per cent by female speakers.
The age of speakers included in FRED ranges from six years to 102
years, with a mean age of 75.2 years. A breakdown of age groups,
according to the amount of text produced by them, is given in Table 2. As
can be seen, about three quarters of the textual material in FRED is
produced by speakers older than 60 years.
Table 2.  FRED – speakers’ ages

age group        number of speakers    % of textual material in
                                       corpus produced
0 – 14 years              9                    0.5%
15 – 24 years            14                    1.2%
25 – 34 years             2                    0.2%
35 – 44 years             2                    0.1%
45 – 59 years            14                    3.8%
60+ years               233                   74.8%
unknown                 145                   19.4%

The oldest of FRED’s speakers was born in 1877. Overall, 14 speakers (3.3
per cent) were born between 1880 and 1889, 60 speakers (14.3 per cent)
were born between 1890 and 1899, 96 speakers (22.9 per cent) were born
between 1900 and 1909, and 64 speakers (15.2 per cent) were born between
1910 and 1919. This means that 89 per cent of all speakers in FRED were
born before 1920.



2.5.3. Recordings
The material included in FRED was recorded between 1968 and 1999. A
detailed breakdown of recording dates can be found in Table 3. Over two
thirds of all interviews were thus conducted in the 1970s and 1980s,
guaranteeing comparability with much dialectological work conducted at
that time.
Table 3.  FRED – interview recording dates

recording date    number of speakers    % of all speakers
1961–69                    2                   0.5%
1970–79                  122                  29.1%
1980–89                  163                  38.9%
1990–99                   61                  14.6%
unknown                   71                  16.9%

3. Linguistic consequences of using oral history material

The decision to base the FRED corpus predominantly on sources of oral
history projects has had a range of linguistic consequences, some of them
foreseen, others not predictable at the outset. Perhaps the most clearly
predictable linguistic consequences stem from the fact that oral history
material necessarily involves the speaker talking about his or her life story
at great length – very often, in fact, the speakers are actively encouraged to
talk almost exclusively about their past. In the realm of tense and aspect, a
predominance of past time narratives implies a predominance of past tense
contexts (although not infrequently, of course, stretches of past time
narratives are narrated in the historical present tense as well). This is an
advantage for studies concentrating on past tense paradigms (for example
Anderwald in progress), but a clear disadvantage for any investigation into
the present tense, as the data typically yields too few examples to make a
regional comparison reliable (see Anderwald 2004). It also means that any
features that are linked to the present tense domain can be expected (and
indeed shown) to be underrepresented: for example use of the (present)
progressive vs. the simple form; forms for the “recent past” (for example
the “after”-perfect in Hiberno-English); uses, if any, of a habitual present
and so on.



A second feature one would expect, considering the fact that FRED
speakers tend to tell their own life stories, is a skewing in pronoun
frequencies. Based on the monologic nature of many of the interviews in
FRED, we might expect first person singular and first person plural
contexts to be over-represented. However, a comparison with the more
balanced demographic part of the British National Corpus (BNC) that
records everyday spontaneous conversations reveals that this is not the case
(see Table 4).
Table 4.  Personal pronouns in FRED and the BNC

             FRED                       BNC spoken
Pronoun      occ.       % of total      occ.         % of total
I            61,458       23.4          309,797        26.8
he           29,733       11.3           75,442         6.5
she           9,418        3.6           42,879         3.7
it           41,776       15.9          254,049        22.0
we           27,240       10.4          108,698         9.4
you          54,163       20.6          268,642        23.2
they         38,608       14.7           96,672         8.4
total       262,396                   1,156,179

Despite the impression one gets when reading through FRED transcripts,
first person contexts are not over-represented in the corpus, but account for
roughly one third of all personal pronoun contexts in both FRED and the
spoken part of the BNC. Although there are slight deviations in frequencies
for individual third person contexts (which can easily be explained on the
basis of the nature of the recording situations), the overall frequency of first
versus third person contexts is surprisingly similar at 33.8 per cent versus
45.5 per cent in FRED and 36.2 per cent versus 40.6 per cent in the BNC
(spoken). Based on these figures, we expect that comparative analyses of
FRED and other corpora of spoken English involving the category person
will produce representative results.
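The first versus third person shares quoted above can be recomputed from the raw counts in Table 4, as in the following sketch. Here I and we are counted as first person and he, she, it and they as third person; the function name is hypothetical. Computed from raw counts, the shares agree with the quoted figures to within 0.1 percentage points, since the quoted figures correspond to sums of the rounded column percentages.

```python
# Raw pronoun counts from Table 4.
fred = {"I": 61458, "he": 29733, "she": 9418, "it": 41776,
        "we": 27240, "you": 54163, "they": 38608}
bnc = {"I": 309797, "he": 75442, "she": 42879, "it": 254049,
       "we": 108698, "you": 268642, "they": 96672}

FIRST = ("I", "we")
THIRD = ("he", "she", "it", "they")

def share(counts, pronouns):
    """Percentage of all pronoun tokens accounted for by the given pronouns."""
    total = sum(counts.values())
    return 100 * sum(counts[p] for p in pronouns) / total

for name, counts in (("FRED", fred), ("BNC spoken", bnc)):
    first = round(share(counts, FIRST), 1)
    third = round(share(counts, THIRD), 1)
    print(f"{name}: first person {first}%, third person {third}%")
```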
Finally, in the realm of discourse, it has to be stressed that FRED does
not contain genuinely spontaneous, everyday conversations, as for example
the BNC does. In the worst case, some (but fortunately only a tiny minority
of) speakers actually read from prepared notes, as witnessed by pages
rustling in the background and distinctive pauses where pages are turned.
Although this worst case is mercifully rare, many interviews are
nevertheless monologic – understandably, the interviewers tried to keep in
the background most of the time. FRED for this reason would probably not
lend itself well to the investigation of discourse strategies. However, this
limitation is probably not specific to FRED, but applies to dialectological
and sociolinguistic interviews alike, as the main objective is always to
record the speakers’ speech, rather than one’s own (see Feagin 2002).
The nature of the data in FRED influenced the choice of the phenomena
which have been investigated in, so far, four Ph.D. theses and about a
dozen Masters theses. The former include those by Herrmann, Pietsch and
Wagner, which are presented in revised and shortened versions in the
present volume. In all these studies the focus has been on high-frequency
morphosyntactic phenomena. Moreover, the machine-readability of FRED,
which allows analyses via automatic text retrieval programmes like TACT
or WordSmith, has also influenced the methodology, in that for the first
time it is possible to conduct not only qualitative, but also quantitative
studies of dialect morphosyntax applying established corpus-linguistic
techniques.
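The kind of automatic retrieval meant here can be sketched as a simple keyword-in-context (KWIC) search. This is a generic illustration and does not reproduce the behaviour of TACT or WordSmith; the function name and the context width are assumptions, and the sample is taken from example (8).

```python
def kwic(tokens, keyword, width=3):
    """Return (left context, hit, right context) for each keyword match."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

# Stretch of example (8), split on whitespace.
sample = ("When them apples were ripe you could pick them up "
          "and could press them like that")
tokens = sample.split()

# Every token of "them" is retrieved; the analyst then decides which hits
# are demonstratives ("them apples") and which are personal pronouns.
for left, hit, right in kwic(tokens, "them"):
    print(f"{left:>20} [{hit}] {right}")
```

Retrieval of this kind supplies the complete set of candidate tokens on which counts can be based, while the classification of each token remains a matter of qualitative judgement.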
Comparisons across the whole FRED material have not been possible
for very long yet, so most truly comparative projects by members of the
Freiburg research team are currently still work in progress. These include
cross-dialectal comparisons of multiple negation (Anderwald to appear
2005), past tense paradigms (Anderwald in progress), pronoun systems
(Hernández in progress), complementation patterns (Kolbe in progress),
and for several areas of dialect morphosyntax the phenomenon of priming
(Szmrecsanyi 2004 and 2005). In addition, a whole range of Masters theses
have been completed or are currently under way on the basis of material
from FRED. For further information on FRED, the Freiburg project and
future perspectives of (English) dialect syntax, especially from a
typological perspective, compare Kortmann 2002, 2003, 2004; Anderwald
and Wagner 2005.

Notes
1.  During the funding period (2000-2005) of Project KO 1181/1-1,2,3 a dozen
    Masters theses and four Ph.D. theses were completed. Two doctoral theses,
    two postdoctoral theses and several Masters theses based on FRED are well
    under way. A selection of studies which have grown out of this project is
    given in the references (marked with a superscript * preceding the
    publication year).


