Nick Groom

Closed-class keywords and corpus-driven discourse analysis


The advent of computer-assisted keyword analysis has opened up a range of exciting new possibilities for researchers interested in studying the meanings, values and attitudes associated with particular discourse communities. However, keyword lists extracted from corpora representing specialized discourses typically contain hundreds or even thousands of candidate items for analysis, even when extremely stringent statistical cut-off values are applied (e.g. p < .0000001). This being the case, the question arises as to how a principled and yet maximally useful selection of items for closer qualitative analysis might be made.

A simple and frequently applied solution is to focus exclusively on the words at the very top of the list, and to disregard anything that falls below an arbitrary cut-off point. Another commonly (and in practice often jointly) applied solution is to prioritise certain types of keyword over others, or even to discard some keyword types from the analysis altogether. Usually this involves discarding any keywords that belong to closed grammatical classes (i.e. conjunctions, determiners, prepositions and pronouns), on the grounds that they are not indicative of the semantic content of a corpus and thus not germane to the concerns of the discourse analyst.

In this talk, however, I will present a case for pursuing exactly the reverse strategy. That is, I will make a case for discarding all of the open-class items (i.e. the nouns, verbs, adjectives and adverbs) in a keyword list as a preliminary step, and focusing instead on the closed-class keywords that remain. My argument is divided into two parts. The first part explains why, contrary to conventional wisdom among discourse researchers, closed-class keywords constitute valid and productive objects of study. The second part provides a detailed rationale for focusing exclusively on closed-class keywords. Throughout, the argument will be illustrated with examples from my own ongoing research into the academic discourses of history and literary criticism.
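
As a rough illustration of the selection strategy proposed above, the following Python sketch extracts keywords from a study corpus by comparison with a reference corpus and then retains only the closed-class items. It is a minimal sketch under stated assumptions, not the procedure used in the research itself: the abstract does not name a keyness statistic, so the widely used log-likelihood (G2) measure is assumed here; the CLOSED_CLASS word list is a small hypothetical sample rather than a full inventory; and all function names are invented for illustration. Both corpora are assumed to be non-empty lists of word tokens.

```python
import math
from collections import Counter

# Hypothetical, deliberately small closed-class word list (an assumption for
# illustration; a real study would need a fuller list or a POS tagger).
CLOSED_CLASS = {
    "and", "but", "or", "if", "because", "that", "which",        # conjunctions
    "the", "a", "an", "this", "these",                           # determiners
    "of", "in", "on", "to", "by", "with", "between", "against",  # prepositions
    "it", "he", "she", "they", "we", "its", "their",             # pronouns
}


def log_likelihood(a, b, c, d):
    """G2 for a word seen a times in a study corpus of c tokens
    and b times in a reference corpus of d tokens."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return max(2.0 * g2, 0.0)  # guard against tiny negative rounding errors


def p_value(g2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(g2 / 2.0))


def closed_class_keywords(study_tokens, reference_tokens, p_cutoff=1e-7):
    """Return (word, G2) pairs for closed-class words that are significantly
    more frequent in the study corpus, sorted by descending keyness."""
    study = Counter(t.lower() for t in study_tokens)
    ref = Counter(t.lower() for t in reference_tokens)
    c, d = sum(study.values()), sum(ref.values())
    hits = []
    for word, a in study.items():
        if word not in CLOSED_CLASS:
            continue  # discard open-class items as a preliminary step
        b = ref[word]
        if a / c <= b / d:
            continue  # keep only positive keywords (relatively overused)
        g2 = log_likelihood(a, b, c, d)
        if p_value(g2) < p_cutoff:
            hits.append((word, g2))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Calling closed_class_keywords(study_tokens, reference_tokens) on two tokenized corpora yields the short list of closed-class keywords on which qualitative analysis (for example, concordance reading) could then concentrate. The default p_cutoff mirrors the example cut-off value mentioned earlier and can be relaxed for smaller corpora.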

