Ling-Xing Kong, You-Gang Chu, Zheng Ma, Jian-Bing Zhang, Jia-Jun Chen. Mix-lingual Relation Extraction: Dataset and A Training Approach[J]. Journal of Computer Science and Technology. DOI: 10.1007/s11390-024-4314-y

Mix-lingual Relation Extraction: Dataset and A Training Approach

Relation extraction is a pivotal task in natural language processing, with numerous real-world applications. Existing research predominantly centers on monolingual relation extraction or on cross-lingual enhancement for relation extraction. However, there is a notable gap in understanding relation extraction in mix-lingual (or code-switching) scenarios, in which individuals blend content from different languages within a sentence, producing mix-lingual text. The effectiveness of existing relation extraction models in such scenarios remains largely unexplored due to the absence of dedicated datasets. To address this gap, we introduce the Mix-lingual Relation Extraction (MixRE) task and construct a human-annotated dataset, MixRED, to support it. We also propose a hierarchical training approach for the mix-lingual scenario, named Mix-lingual Training (MixTrain), designed to enhance the performance of large language models (LLMs) when capturing relational dependencies from mix-lingual content spanning different semantic levels. Our experiments evaluate state-of-the-art supervised models and LLMs on the constructed dataset, and the results indicate that MixTrain notably improves model performance. Moreover, we investigate the effectiveness of mix-lingual content as a tool for transferring learned relational dependencies across languages, and we examine the factors influencing the performance of both supervised models and LLMs on the novel MixRE task.
