Scaling Conditional Random Fields by One-Against-the-Other Decomposition
Abstract
As a powerful sequence labeling model, conditional random fields (CRFs) have been applied successfully to many natural language processing (NLP) tasks. However, the high complexity of CRF training permits only a very small tag (or label) set, because training becomes intractable as the tag set grows. This paper proposes an improved decomposed training and joint decoding algorithm for CRF learning. Instead of training a single CRF model for all tags, it trains a binary sub-CRF independently for each tag. An optimal tag sequence is then produced by a joint decoding algorithm based on the probabilistic outputs of all the sub-CRFs involved. To test its effectiveness, we apply this approach to Chinese word segmentation (CWS), treated as a sequence labeling problem. Our evaluation shows that it reduces the computational cost of this task by 40--50\% without any significant performance loss on various large-scale data sets.
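To illustrate the decomposition, the sketch below shows the simplest form of joint decoding: each binary sub-CRF outputs a per-position probability that its tag applies, and the decoder selects the most confident tag at each position. This is a minimal illustration under assumed interfaces (the function name `joint_decode`, the marginals layout, and the toy numbers are hypothetical, not from the paper); the paper's actual joint decoding may additionally enforce tag-transition constraints via dynamic programming.

```python
from typing import Dict, List

def joint_decode(marginals: Dict[str, List[float]]) -> List[str]:
    """Combine per-tag binary sub-CRF outputs into one tag sequence.

    marginals maps each tag (e.g. 'B', 'M', 'E', 'S' in Chinese word
    segmentation) to its sub-CRF's probability, at every position, that
    the tag applies. At each position, the tag whose sub-CRF is most
    confident is selected.
    """
    tags = list(marginals)
    length = len(next(iter(marginals.values())))
    sequence = []
    for t in range(length):
        # Pick the tag whose binary sub-CRF assigns the highest probability.
        sequence.append(max(tags, key=lambda tag: marginals[tag][t]))
    return sequence

# Toy example: four sub-CRFs' marginals over a three-character input.
marginals = {
    "B": [0.7, 0.1, 0.2],
    "M": [0.1, 0.2, 0.1],
    "E": [0.1, 0.6, 0.1],
    "S": [0.1, 0.1, 0.6],
}
print(joint_decode(marginals))  # ['B', 'E', 'S']
```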