Hai Zhao, Chunyu Kit. Scaling Conditional Random Fields by One-Against-the-Other Decomposition[J]. Journal of Computer Science and Technology, 2008, 23(4): 612-619.

Scaling Conditional Random Fields by One-Against-the-Other Decomposition

More Information
  • Received Date: August 03, 2007
  • Revised Date: December 28, 2007
  • Published Date: July 09, 2008
  • As a powerful sequence labeling model, conditional random fields (CRFs) have been applied successfully to many natural language processing (NLP) tasks. However, the high complexity of CRF training permits only a very small tag (or label) set, because training becomes intractable as the tag set grows. This paper proposes an improved decomposed training and joint decoding algorithm for CRF learning. Instead of training a single CRF model for all tags, it trains a binary sub-CRF independently for each tag. An optimal tag sequence is then produced by a joint decoding algorithm based on the probabilistic output of all sub-CRFs involved. To test its effectiveness, we apply this approach to Chinese word segmentation (CWS) as a sequence labeling problem. Our evaluation shows that it can reduce the computational cost of this language processing task by 40-50% without any significant performance loss on various large-scale data sets.
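
The joint decoding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes that each binary sub-CRF has already produced, for every character position, the probability that the position carries its tag, and it combines those outputs with a Viterbi search over log-probabilities. The function name `joint_decode`, the parameters `tag_probs` and `allowed`, the four-tag B/M/E/S set, and the transition table are all hypothetical choices made for illustration.

```python
import math

def joint_decode(tag_probs, allowed):
    """Pick one tag sequence from the outputs of independent binary models.

    tag_probs: dict mapping tag -> list of probabilities, one per position,
               as a hypothetical binary sub-model for that tag might emit.
    allowed:   set of (previous_tag, next_tag) pairs that may be adjacent.
    Returns the sequence maximizing the sum of per-position log-probabilities
    subject to the allowed transitions (plain Viterbi search).
    """
    tags = sorted(tag_probs)
    n = len(tag_probs[tags[0]])
    if n == 0:
        return []

    def logp(tag, i):
        # Clamp to avoid log(0) when a sub-model outputs exactly zero.
        return math.log(max(tag_probs[tag][i], 1e-12))

    best = {t: logp(t, 0) for t in tags}  # best score of a path ending in t
    back = []                             # back-pointers for positions 1..n-1
    for i in range(1, n):
        new_best, pointers = {}, {}
        for t in tags:
            # Disallowed transitions get -inf, so they are never preferred.
            score, prev = max(
                (best[p] if (p, t) in allowed else -math.inf, p) for p in tags
            )
            new_best[t] = score + logp(t, i)
            pointers[t] = prev
        best = new_best
        back.append(pointers)

    # Trace the best path backwards from the best final tag.
    path = [max(tags, key=lambda t: best[t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Assumed tag set for character-based segmentation:
# B = word begin, M = word middle, E = word end, S = single-character word.
ALLOWED = {("B", "M"), ("B", "E"), ("M", "M"), ("M", "E"),
           ("E", "B"), ("E", "S"), ("S", "B"), ("S", "S")}

# Made-up sub-model outputs for a three-character input.
example = {
    "B": [0.90, 0.10, 0.20],
    "M": [0.05, 0.30, 0.10],
    "E": [0.05, 0.50, 0.30],
    "S": [0.10, 0.20, 0.60],
}
print(joint_decode(example, ALLOWED))
```

Because each sub-model is trained independently, the per-position probabilities need not sum to one across tags, and taking the most probable tag at each position can yield an illegal sequence such as B followed by B. Searching over whole sequences with a transition table is one simple way to avoid that.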