Extracting Local Schema from Semistructured Data Based on Graph-Oriented Semantic Model
Abstract
Many modern applications (e-commerce, digital libraries, etc.) require integrated access to various information sources, from traditional RDBMSs to semistructured Web repositories. Extracting schema from semistructured data is a prerequisite for integrating heterogeneous information sources. The traditional method, which extracts a global schema, may require time and space that grow exponentially with the number of objects and edges in the source. This paper presents a new method that extracts a local schema instead. The algorithm keeps the extracted schema within the "schema diameter" by examining the semantic distance of the target set and by using the Hash class and its path distance operation. The method is very effective at restraining the schema from expanding. A prototype validates the new approach.