Publication | Closed Access
From key words to key semantic domains
Citations: 720
References: 19
Year: 2008
Engineering, Tagging, Sentence Semantics, Semantics, Semantic Web, Corpus Linguistics, Language Processing, Text Mining, Natural Language Processing, Applied Linguistics, Semantic Approach, Computational Linguistics, Corpus Analysis, Language Studies, Key Words, Terminology Extraction, Key Words Method, Semantic Computing, Lexical Resource, Keyword Extraction, Language Corpus, Key Domains, Linguistics, Semantic Representation
The approach combines the corpus-based and corpus-driven paradigms in corpus linguistics. The paper extends the key words method for comparing corpora: it uses automatic part-of-speech and semantic-field tagging and applies the keyness calculation to tag frequency lists in order to extract key domains. The method is implemented in the web-based tool Wmatrix and applied to a comparison of election manifestos. Combining key words and key domains lets macroscopic analysis guide microscopic investigation of linguistic features.
This paper reports the extension of the key words method for the comparison of corpora. Using automatic tagging software that assigns part-of-speech and semantic field (domain) tags, a method is described which permits the extraction of key domains by applying the keyness calculation to tag frequency lists. The combination of the key words and key domains methods is shown to allow macroscopic analysis (the study of the characteristics of whole texts or varieties of language) to inform the microscopic level (focussing on the use of a particular linguistic feature), thereby suggesting those linguistic features which should be investigated further. The resulting ‘data-driven’ approach presented here combines elements of both the ‘corpus-based’ and ‘corpus-driven’ paradigms in corpus linguistics. A web-based tool, Wmatrix, implementing the proposed method is applied in a case study: the comparison of UK 2001 general election manifestos of the Labour and Liberal Democratic parties.
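The idea of applying a keyness calculation to tag frequency lists can be illustrated with a minimal Python sketch. It assumes the log-likelihood (G2) statistic as the keyness measure, which the abstract does not name. The function names, the tag labels (`GOV`, `EDU`, `OTHER`) and the counts are hypothetical toy values, not Wmatrix output or figures from the paper.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """G2 for an item seen a times in corpus A (c tokens) and b times in corpus B (d tokens)."""
    # Expected counts if the item were spread evenly across both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def key_items(freq_a, freq_b):
    """Rank items (words or tags) by keyness; '+' marks relative overuse in corpus A."""
    total_a = sum(freq_a.values())
    total_b = sum(freq_b.values())
    rows = []
    for item in set(freq_a) | set(freq_b):
        a = freq_a.get(item, 0)
        b = freq_b.get(item, 0)
        score = log_likelihood(a, b, total_a, total_b)
        # Equal relative frequencies are reported as '-' by this simple comparison.
        direction = "+" if a / total_a > b / total_b else "-"
        rows.append((item, score, direction))
    return sorted(rows, key=lambda r: r[1], reverse=True)


# Hypothetical semantic-tag frequency lists for two small corpora (toy data).
tags_a = Counter({"GOV": 30, "EDU": 50, "OTHER": 920})
tags_b = Counter({"GOV": 60, "EDU": 50, "OTHER": 890})

for tag, score, direction in key_items(tags_a, tags_b):
    print(f"{tag:6s} {direction} {score:.2f}")
```

The same `key_items` function works unchanged on word, part-of-speech or semantic-tag frequency lists, which is the point of the extension: only the unit being counted changes, not the comparison itself.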