Citation: Xin-Dong Wu (吴信东), Xing-Quan Zhu (朱兴全), Qi-Jun Chen (陈琪君), Fei-Yue Wang (王飞跃). Ubiquitous Mining with Interactive Data Mining Agents[J]. Journal of Computer Science and Technology (计算机科学技术学报), 2009, 24(6): 1018-1027.

Ubiquitous Mining with Interactive Data Mining Agents

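  • Abstract (translated from the Chinese): Data mining aims to extract hidden patterns or knowledge from large collections of data and to use them to support decision making or system optimization. Advances in storage hardware and network technology have made data collection increasingly easy, and the follow-on problem is how to digest the data in a timely, effective way and extract the important knowledge or pattern information. Research in this area has been applied very successfully to business intelligence, bioinformatics, national defense and security, geophysics, and other related fields.
    In practice, data mining involves many algorithms and theories, and each method has its own strengths and limitations. For ordinary users, fully understanding all of the algorithms is impractical, and the labor cost and development cycle involved are too high. In this paper, we propose a ubiquitous data mining system based on interactive agents. The core research question is how to develop an autonomous data mining agent that helps users quickly complete data mining tasks and obtain experimental results without any programming knowledge. Following this design idea, we propose OIDM (Online Interactive Data Mining), a data mining agent. The system collects user information through human-computer interaction and guides users through the automatic selection of a data mining model (for example classification, clustering, or association rules) and through parameter tuning. In practice, the OIDM data mining agent model has four main characteristics: (1) Autonomy: a data mining agent is a highly autonomous entity that can, on its own, carry out complete data mining functions and analyze and submit the results. (2) Interactivity: data mining cannot dispense with human intervention and guidance, so each data mining agent is itself a human-computer interactive information-collection system. (3) Completeness: each data mining agent is equipped with a fairly complete set of data mining algorithms, so it can perform basic data mining functions independently. (4) Simplicity: the data mining process should involve as little human intervention as possible, and the mining results should be simple, clear, and convenient for subsequent analysis.
    We implemented the OIDM interactive agent data mining system on the Java platform. Experimental results and user feedback indicate that the agent system can help users carry out data mining analysis and obtain the expected results in a very short time. In practice, the system is especially effective for users from outside the data mining field.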

     

    Abstract: Due to the increasing availability and sophistication of data recording techniques, multiple information sources and distributed computing are becoming important trends in modern information systems. Many applications, such as security informatics and social computing, require a ubiquitous data analysis platform so that decisions can be made rapidly in distributed and dynamic system environments. Although data mining is now widely used to achieve such goals, building a data mining system is a nontrivial task that may require a complete understanding of numerous data mining techniques as well as solid programming skills. Employing agent techniques for data analysis is thus becoming increasingly important for implementing an effective ubiquitous mining platform, especially for users not familiar with engineering and the computational sciences. Such data mining agents should, in practice, be intelligent, complete, and compact. In this paper, we present an interactive data mining agent, OIDM (online interactive data mining), which provides three categories of data mining tools (classification, association analysis, and clustering) and interacts with the user to facilitate the mining process. The interactive mining is accomplished by interviewing the user about the data mining task to gain efficient and intelligent control over the mining process. OIDM can help users find appropriate mining algorithms, refine and compare the mining process, and finally achieve the best mining results. Such interactive data mining agent techniques provide alternative solutions for rapidly deploying data mining techniques to broader areas of data intelligence and knowledge informatics.
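    As a rough illustration of the interview idea described in the abstract, the sketch below maps two yes/no answers to one of the three tool categories it names. It is not the OIDM implementation: the class name `MiningInterviewSketch`, the two questions, and the decision rule are all hypothetical and chosen only to show how an interview answer could steer tool selection.

```java
/**
 * Hypothetical sketch of interview-driven tool selection.
 * Not the OIDM implementation: the questions and the rule are assumptions.
 */
class MiningInterviewSketch {

    /** The three tool categories named in the abstract. */
    enum Category { CLASSIFICATION, ASSOCIATION_ANALYSIS, CLUSTERING }

    /**
     * Recommends a category from two interview answers (both assumed questions):
     *  - hasLabeledTarget: does the data contain a known outcome column to predict?
     *  - wantsCoOccurrence: is the user looking for items that tend to occur together?
     */
    static Category recommend(boolean hasLabeledTarget, boolean wantsCoOccurrence) {
        if (hasLabeledTarget) {
            return Category.CLASSIFICATION;        // a known target suggests supervised learning
        }
        if (wantsCoOccurrence) {
            return Category.ASSOCIATION_ANALYSIS;  // co-occurrence suggests association rules
        }
        return Category.CLUSTERING;                // otherwise group similar records
    }

    public static void main(String[] args) {
        // Scripted answers stand in for a real interactive dialogue.
        System.out.println(recommend(true, false));   // prints CLASSIFICATION
        System.out.println(recommend(false, true));   // prints ASSOCIATION_ANALYSIS
        System.out.println(recommend(false, false));  // prints CLUSTERING
    }
}
```

    A real agent would presumably ask more questions and would also cover algorithm choice and parameter tuning within each category; this sketch stops at the category level.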

     
