A Clustering Algorithm for Planning the Integration Process of a Large Number of Conceptual Schemas

Abstract: When tens or even hundreds of schemas are involved in an integration process, criteria are needed for choosing the clusters of schemas to be integrated, so that the integration problem can be handled through an efficient iterative process. Schemas should be assigned to clusters according to cohesion and coupling criteria based on the similarities and dissimilarities among schemas. In this paper, we propose an algorithm for a novel variant of the correlation clustering approach, aimed at assisting a designer in integrating a large number of conceptual schemas. The variant introduces upper and lower bounds on the number of schemas in each cluster, in order to avoid integration contexts that are too complex or too simple, respectively. Because the problem is an NP-hard combinatorial problem, we give a heuristic for solving it. An experiment shows an appreciable increase in the effectiveness of the schema integration process when clusters are computed by the proposed algorithm rather than defined manually by an expert.
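The abstract does not spell out the heuristic itself, so the following is only an illustrative sketch of size-bounded correlation clustering, not the authors' algorithm. It assumes a symmetric score matrix `sim` in which positive entries mean two schemas are similar and negative entries mean they are dissimilar; the names `cluster_schemas`, `cross_weight`, `lower` and `upper` are hypothetical. The sketch greedily merges clusters while the merge improves agreement and respects the upper bound, then merges undersized clusters wherever the upper bound still allows it.

```python
from itertools import combinations


def cross_weight(sim, a, b):
    """Sum of pairwise scores between the schemas of clusters a and b."""
    return sum(sim[i][j] for i in a for j in b)


def cluster_schemas(sim, lower, upper):
    """Greedy sketch (an assumption, not the paper's heuristic).

    sim   : symmetric matrix, sim[i][j] > 0 similar, < 0 dissimilar
    lower : desired minimum number of schemas per cluster
    upper : maximum number of schemas per cluster
    """
    clusters = [[i] for i in range(len(sim))]

    def best_pair(must_include_small, min_gain):
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            ca, cb = clusters[a], clusters[b]
            if len(ca) + len(cb) > upper:
                continue  # merging would violate the upper bound
            if must_include_small and min(len(ca), len(cb)) >= lower:
                continue  # repair phase only touches undersized clusters
            gain = cross_weight(sim, ca, cb)
            if gain > min_gain and (best is None or gain > best[0]):
                best = (gain, a, b)
        return best

    # Phase 1: merge only while the merge has positive gain.
    # Phase 2: merge undersized clusters with their best feasible partner,
    #          even at a loss; clusters with no feasible partner stay small.
    for must_include_small, min_gain in ((False, 0.0), (True, float("-inf"))):
        while (pair := best_pair(must_include_small, min_gain)) is not None:
            _, a, b = pair  # a < b, so deleting b leaves index a valid
            clusters[a] = clusters[a] + clusters[b]
            del clusters[b]

    return sorted(sorted(c) for c in clusters)
```

Because the underlying problem is NP-hard, a greedy pass like this gives no optimality guarantee, and the lower bound is only satisfied when a feasible merge exists. It is meant to show how the two bounds constrain which merges are allowed.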

     
