Using a Computer Corpus to Supplement a Citation Collection for a Historical Dictionary

Abstract

A major project at the Institute of Lexicography, University of Iceland, is the production of a historical dictionary of Icelandic covering the period from 1540 to the present. Material for the dictionary has been collected for over four decades. When editing began, it became apparent that the Institute's collections, though containing millions of dictionary slips, still had numerous gaps that needed to be filled. To assist with the search for additional quotations and usage samples, we have assembled a computerized corpus of over 20 million running words. This text-base is used to supplement the collection of dictionary slips. This paper describes the background for the creation of the text-base and our experience with using standard Unix utilities to search it.