Publication | Closed Access
Applying Conditional Random Fields to Japanese Morphological Analysis
Citations: 723 | References: 15 | Year: 2004 | Venue: Unknown
Applying CRFs to Japanese morphological analysis is complicated by ambiguous word boundaries, whereas prior CRF work assumed fixed boundaries. The paper shows how CRFs can be applied to Japanese morphological analysis despite this ambiguity. The authors trained and evaluated CRFs on a standard Japanese morphological analysis corpus, comparing results to HMMs and MEMMs. According to the paper, CRFs resolve long-standing issues in the task: they allow flexible feature design for hierarchical tagsets, reduce label and length bias, and outperform HMMs and MEMMs.
This paper presents Japanese morphological analysis based on conditional random fields (CRFs). Previous work on CRFs assumed that observation sequence (word) boundaries were fixed. However, word boundaries are not clear in Japanese, and hence a straightforward application of CRFs is not possible. We show how CRFs can be applied to situations where word boundary ambiguity exists. CRFs offer a solution to the long-standing problems in corpus-based or statistical Japanese morphological analysis. First, flexible feature designs for hierarchical tagsets become possible. Second, influences of label and length bias are minimized. We experiment with CRFs on the standard testbed corpus used for Japanese morphological analysis, and evaluate our results using the same experimental dataset as the HMMs and MEMMs previously reported in this task. Our results confirm that CRFs not only solve the long-standing problems but also improve the performance over HMMs and MEMMs.
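To make the word boundary issue concrete, the following is a minimal sketch of scoring a segmentation lattice with a single global normalization, so that candidate analyses with different numbers of words are compared on equal footing. The lexicon, tag names, feature weights, and function names are all hypothetical and chosen for illustration; they are not taken from the paper. The sketch enumerates paths exhaustively for clarity, whereas a practical system would use dynamic programming over the lattice.

```python
import math

# Hypothetical toy lexicon: surface form -> candidate tags (illustrative only).
LEXICON = {
    "東": ["noun"],
    "京": ["noun"],
    "都": ["suffix"],
    "東京": ["noun"],
    "京都": ["noun"],
}

# Hypothetical feature weights; a trained model would learn these from data.
UNIGRAM = {
    ("東京", "noun"): 1.0,
    ("京都", "noun"): 1.0,
    ("都", "suffix"): 0.5,
    ("東", "noun"): -0.5,
    ("京", "noun"): -1.0,
}
BIGRAM = {
    ("noun", "suffix"): 1.0,
    ("noun", "noun"): -0.5,
}


def paths(text, start=0):
    """Yield every lattice path: a list of (word, tag) pairs covering text[start:]."""
    if start == len(text):
        yield []
        return
    for end in range(start + 1, len(text) + 1):
        word = text[start:end]
        for tag in LEXICON.get(word, []):
            for rest in paths(text, end):
                yield [(word, tag)] + rest


def score(path):
    """Sum the unigram and tag-bigram weights along one path."""
    total = 0.0
    prev = "BOS"
    for word, tag in path:
        total += UNIGRAM.get((word, tag), 0.0) + BIGRAM.get((prev, tag), 0.0)
        prev = tag
    return total


def distribution(text):
    """Return (path, probability) pairs, normalized over ALL paths in the lattice."""
    all_paths = list(paths(text))
    z = sum(math.exp(score(p)) for p in all_paths)
    return [(p, math.exp(score(p)) / z) for p in all_paths]


if __name__ == "__main__":
    for path, prob in sorted(distribution("東京都"), key=lambda t: -t[1]):
        print(" / ".join(f"{w}({t})" for w, t in path), round(prob, 3))
```

Because the normalizer sums over every path in the lattice rather than over the successors of each state, a two-word analysis and a three-word analysis compete directly on their total scores. This is the property that the abstract's claim about reduced label and length bias relies on.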