Dialect in British Fiction: 1800-1836
Funded by The Arts and Humanities Research Council
Supported by The University of Sheffield
Tagging the Text Extracts

The text extracts were tagged by hand, using XML. The primary tagging categories are as follows:

  • Grammar
    Wherever the grammar diverges from Standard English we have tagged for this. We have further tagged for clearly identifiable and commonly occurring features including: double negative; nonstandard subject-verb concord; nonstandard conjunction; nonstandard tense; nonstandard determiner; nonstandard personal pronoun; nonstandard relative pronoun; adverb lacks ‘ly’; clefting; a-prefixing.
  • Vocabulary
    Items that were not considered to be Standard English at the time
  • Orthographic Contractions
    Where the spelling indicates that rapid speech processes are occurring (e.g. isn’t, ’tis)
  • Orthographic Respellings
    Where words are respelled in order to indicate something about pronunciation (e.g. fevver for ‘feather’, ’ospital for ‘hospital’). Respellings are further tagged for clearly identifiable and commonly occurring features including: th-stopping, fricative voicing, h-dropping, metathesis, th-fronting, v-w transposition, ‘s’ for ‘c’, ‘t’ insertion on ‘ch’.
  • Idiom
    Idiomatic phrases of more than one word
  • Discourse Marker
    A word or phrase that stands by itself, signals speaker attitude and is regionally or socially marked (e.g. Lawks, to be sure)
  • Metalanguage
    Where the speaker or narrator comments explicitly on language, particularly language variety
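
The project's actual XML schema is not reproduced on this page, so the following is only a minimal sketch of how hand tagging along these lines might look and be queried. The element name `feature`, the attributes `category` and `type`, and the sample sentence itself are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical markup: element and attribute names are illustrative only,
# not the project's actual schema. The sentence is invented for the example.
SAMPLE = """<extract>
  <feature category="respelling" type="h-dropping">'ospital</feature> ain't far,
  <feature category="discourse-marker">to be sure</feature>, and I
  <feature category="grammar" type="double-negative">never said nothing</feature>.
</extract>"""

def count_categories(xml_text):
    """Count how often each tagging category occurs in one extract."""
    root = ET.fromstring(xml_text)
    return Counter(f.get("category") for f in root.iter("feature"))

counts = count_categories(SAMPLE)
for category, n in sorted(counts.items()):
    print(category, n)
```

Keeping the broad category and the finer-grained feature in separate attributes would let a user search either for all respellings or only for h-dropping, which mirrors the two levels of tagging described in the list above.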

There are considerable challenges inherent in tagging textual material of this type. The primary purpose of this database is to describe how writers manipulate Standard English (the default medium of printed texts in the nineteenth century) in order to represent something about the nonstandard speech of literary characters to a primarily English reading audience. As such, the task is very different from tagging a transcribed corpus of natural language. We have attempted to tag what the writers were doing in order to indicate how the language of the passage diverges from Standard English and, through our tagging, to enable users of the database to explore the choices made by those writers.

Inevitably, tagging of this type involves a degree of subjectivity, and in particular some categories overlap. For example, should the Scots word ‘gude’ be considered as a vocabulary item in its own right, or as a respelling of ‘good’? From the perspective of a contemporary Scots speaker, ‘gude’ is a Scots word. But from the point of view of a nineteenth-century English reader it is a respelling whose meaning can easily be retrieved without any special knowledge of Scots vocabulary. We have tried to be consistent in our handling of terms, and to tag in multiple ways where appropriate. However, we expect to uncover some inconsistencies as the database is used, and we will be updating our tagging as appropriate on an annual basis.
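
As an illustration of tagging one item in multiple ways, the sketch below lets a single tagged word carry two categories at once. The space-separated `category` attribute, the `gloss` attribute and the `feature` element are assumptions made for this example, not the database's real encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: 'gude' is tagged both as a vocabulary item and as a
# respelling of 'good'. Attribute names are illustrative only.
GUDE_SAMPLE = (
    '<extract>A <feature category="vocabulary respelling" gloss="good">'
    "gude</feature> man.</extract>"
)

def items_in_category(xml_text, category):
    """Return the text of every tagged item that carries the given category."""
    root = ET.fromstring(xml_text)
    return [
        f.text
        for f in root.iter("feature")
        if category in f.get("category", "").split()
    ]

print(items_in_category(GUDE_SAMPLE, "vocabulary"))
print(items_in_category(GUDE_SAMPLE, "respelling"))
```

Under this arrangement a search by either category retrieves ‘gude’, so the overlap does not force a choice between the two readings.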

Version 1.1 (December 2015)
Background image reproduced from the Database of Mid Victorian Illustration (DMVI)