Publication | Closed Access
Is linguistic information relevant for the classification of legal texts?
Citations: 36 | References: 20 | Year: 2005
Unknown Venue
Engineering, Part-of-speech Tagging, Law, Linguistic Information Relevant, Legal Study, Corpus Linguistics, Text Mining, Applied Linguistics, Natural Language Processing, Support Vector Machine, Forensic Linguistics, Language Documentation, Information Retrieval, Data Mining, Computational Linguistics, Document Classification, Text Classification, Legal Information Retrieval, Language Studies, Automatic Classification, Knowledge Discovery, Intelligent Classification, Legal Information, Classification, Legal Language, Linguistics
Text classification is an important task in the legal domain. Most legal information is stored as text in a largely unstructured format, so it is important to be able to classify these texts automatically into a predefined set of concepts.
Support Vector Machines (SVMs), a machine learning algorithm, have been shown to be good classifiers for text bases [12]. In this paper, SVMs are applied to the classification of European Portuguese legal texts - the Portuguese Attorney General's Office Decisions - and the relevance of linguistic information in this domain, namely lemmatisation and part-of-speech tags, is evaluated.
The results show that some linguistic information (namely, lemmatisation and part-of-speech tags) can be used successfully to improve the classification results and, at the same time, to decrease the number of features needed by the learning algorithm.
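The feature-reduction effect mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's pipeline: the toy English tokens, their lemmas and part-of-speech tags, and the helper names `bag_of_features` and `vocabulary` are all hypothetical, and no SVM is trained here. It only shows how replacing surface forms with lemmas, and keeping only some part-of-speech tags, shrinks the bag-of-words vocabulary that a classifier would receive.

```python
from collections import Counter

# Hypothetical toy annotations: (surface form, lemma, POS tag).
# A real pipeline would obtain these from a lemmatiser and a POS tagger.
DOCS = [
    [("the", "the", "DET"), ("courts", "court", "NOUN"),
     ("decided", "decide", "VERB"), ("the", "the", "DET"),
     ("appeals", "appeal", "NOUN")],
    [("a", "a", "DET"), ("court", "court", "NOUN"),
     ("decides", "decide", "VERB"), ("an", "an", "DET"),
     ("appeal", "appeal", "NOUN")],
]


def bag_of_features(doc, use_lemma=False, keep_pos=None):
    """Count features for one document.

    use_lemma: count lemmas instead of surface forms.
    keep_pos: optional set of POS tags; tokens with other tags are dropped.
    """
    counts = Counter()
    for form, lemma, pos in doc:
        if keep_pos is not None and pos not in keep_pos:
            continue
        counts[lemma if use_lemma else form] += 1
    return counts


def vocabulary(docs, **kwargs):
    """Sorted list of distinct features over all documents."""
    vocab = set()
    for doc in docs:
        vocab.update(bag_of_features(doc, **kwargs))
    return sorted(vocab)


surface_vocab = vocabulary(DOCS)
lemma_vocab = vocabulary(DOCS, use_lemma=True)
content_vocab = vocabulary(DOCS, use_lemma=True, keep_pos={"NOUN", "VERB"})

# Number of features under each representation.
print(len(surface_vocab), len(lemma_vocab), len(content_vocab))
```

On this toy data the vocabulary shrinks at each step, since inflected forms collapse onto one lemma and determiners are filtered out. Whether a smaller feature set also improves classification is an empirical question, which is what the paper evaluates.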