KnoE: A Web Mining Tool to Validate Previously Discovered Semantic Correspondences
Abstract: Schema or ontology matching consists of finding corresponding entities in two or more knowledge models that belong to the same domain but have been developed separately. Many techniques and tools now address this problem; however, the complex nature of the matching problem makes existing solutions not fully satisfactory in real situations. The Google Similarity Distance has appeared recently. Its purpose is to mine knowledge from the Web using the Google search engine in order to compare text expressions semantically. Our work consists of developing a software application that validates the results discovered by schema and ontology matching tools, using the philosophy behind this distance. Moreover, we are interested in applying this similarity distance not only with Google but also with other popular search engines. The results reveal three main facts. First, some web search engines can help us validate semantic correspondences satisfactorily. Second, there are significant differences among the web search engines. Third, the best results are obtained when using combinations of the web search engines that we have studied.
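The abstract does not give the formula behind the similarity distance. As a rough illustration only, the sketch below uses the normalized distance commonly associated with that name (Cilibrasi and Vitányi), computed from search-engine hit counts. The function names, the hit counts, the index size, and the simple averaging across engines are all assumptions made for this example; they are not taken from KnoE, and no real search-engine API is called.

```python
import math


def normalized_search_distance(fx, fy, fxy, n):
    """Distance between two terms from hit counts (hypothetical helper).

    fx, fy -- number of pages containing each term on its own
    fxy    -- number of pages containing both terms
    n      -- assumed total number of pages indexed by the engine
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # No occurrences or no co-occurrence: treat the terms as unrelated.
        return float("inf")
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def mean_distance(counts_by_engine):
    """Average the distance over several engines (one possible combination)."""
    distances = [normalized_search_distance(*counts)
                 for counts in counts_by_engine.values()]
    return sum(distances) / len(distances)


# Made-up hit counts for one candidate correspondence on two engines.
example_counts = {
    "engine_a": (1000, 1000, 1000, 1_000_000),  # terms always co-occur
    "engine_b": (1000, 1000, 10, 1_000_000),    # terms rarely co-occur
}
combined = mean_distance(example_counts)
```

A lower value suggests the two terms tend to appear on the same pages, so a correspondence proposed by a matching tool could be accepted when the distance falls below a chosen threshold. Both the threshold and the way engines are combined would need to be tuned empirically.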