Xue Wang, Xuan Zhou, Shan Wang. Summarizing Large-Scale Database Schema Using Community Detection[J]. Journal of Computer Science and Technology, 2012, 27(3): 515-526. DOI: 10.1007/s11390-012-1240-1

Summarizing Large-Scale Database Schema Using Community Detection

  • Schema summarization on large-scale databases is a challenge. In a typical large database schema, a large proportion of the tables are closely connected through a few high-degree tables. It is thus difficult to separate these tables into clusters that represent different topics. Moreover, as a schema can be very large, the schema summary needs to be structured into multiple levels to further improve usability. In this paper, we introduce a new schema summarization approach that uses community detection techniques from social network analysis. Our approach consists of three steps. First, we use a community detection algorithm to divide a database schema into subject groups, each representing a specific subject. Second, we cluster the subject groups into abstract domains to form a multi-level navigation structure. Third, we discover representative tables in each cluster to label the schema summary. We evaluate our approach on Freebase, a real-world large-scale database. The results show that our approach can identify subject groups precisely, and that the generated abstract schema layers are very helpful for users exploring the database.
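
As a rough illustration of the first and third steps, the sketch below treats a schema as a graph whose nodes are tables and whose edges are foreign-key links. It is a minimal sketch under stated assumptions, not the algorithm used in the paper: the toy tables in `EDGES` are hypothetical, the community detection step is replaced by a simple greedy modularity-merging heuristic, and the representative table of each group is assumed to be the one with the highest degree. The second step (clustering subject groups into abstract domains) is omitted.

```python
from itertools import combinations

# Hypothetical toy schema: each pair is a foreign-key link between two tables.
EDGES = [
    ("film", "director"), ("film", "actor"), ("director", "actor"),
    ("book", "author"), ("book", "publisher"), ("author", "publisher"),
    ("actor", "author"),  # a single link bridging the two subjects
]


def degrees(edges):
    """Count how many links touch each table."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def detect_subject_groups(edges):
    """Greedy agglomerative modularity merging.

    A stand-in for the community detection step (an assumption, not the
    paper's algorithm): start with one group per table and repeatedly merge
    the pair of groups with the largest positive modularity gain.
    """
    deg = degrees(edges)
    m = len(edges)
    groups = [frozenset([t]) for t in sorted(deg)]
    while True:
        best_gain, best_pair = 0, None
        for a, b in combinations(groups, 2):
            links = sum(
                1 for u, v in edges
                if (u in a and v in b) or (u in b and v in a)
            )
            d_a = sum(deg[t] for t in a)
            d_b = sum(deg[t] for t in b)
            # Modularity gain of merging a and b, scaled by 2*m*m so that
            # the comparison stays in exact integer arithmetic.
            gain = 2 * m * links - d_a * d_b
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            return groups  # no merge improves modularity any further
        a, b = best_pair
        groups = [g for g in groups if g not in (a, b)] + [a | b]


def representative(group, edges):
    """Pick the highest-degree table in a group (ties broken by name)."""
    deg = degrees(edges)
    return max(sorted(group), key=lambda t: deg[t])


groups = detect_subject_groups(EDGES)
summary = {representative(g, EDGES): sorted(g) for g in groups}
```

Here `summary` maps each representative table to the tables of its subject group. On a real schema dominated by a few high-degree tables, a plain heuristic like this one may not separate topics well, which is the difficulty the abstract describes.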