Wei Hu, Yu-Zhong Qu, Xing-Zhi Sun. Bootstrapping Object Coreferencing on the Semantic Web[J]. Journal of Computer Science and Technology, 2011, 26(4): 663-675. DOI: 10.1007/s11390-011-1166-z

Bootstrapping Object Coreferencing on the Semantic Web

Funds: This work is supported in part by the National Natural Science Foundation of China under Grant Nos. 61003018 and 60973024, in part by the National Research Foundation for the Doctoral Program of Higher Education of China under Grant No. 20100091120041, and also in part by the IBM CRL UR Joint Project.
  • Received Date: February 14, 2011
  • Revised Date: June 06, 2011
  • Published Date: July 04, 2011
  • An object on the Semantic Web is likely to be denoted by several URIs coined by different parties. Object coreferencing is the process of identifying "equivalent" URIs of objects in order to achieve a better Data Web. In this paper, we propose a bootstrapping approach to object coreferencing on the Semantic Web. For an object URI, we first establish a kernel consisting of semantically equivalent URIs derived from same-as statements, (inverse) functional properties and (max-)cardinalities, and then extend the kernel using the textual descriptions (e.g., labels and local names) of URIs. We also propose a trustworthiness-based method for ranking the coreferent URIs in the kernel, and a similarity-based method for ranking the URIs in the extension of the kernel. We implement the proposed approach, called ObjectCoref, on a large-scale dataset containing 76 million URIs collected by the Falcons search engine up to 2008. An evaluation of precision, relative recall and response time demonstrates the feasibility of the approach. Additionally, we apply the approach to investigate how widespread the URI alias phenomenon is on the current Semantic Web.
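
The kernel step described in the abstract can be illustrated with a minimal sketch. It is not the ObjectCoref implementation: it assumes triples are given as plain (subject, predicate, object) string tuples, handles only same-as statements and inverse functional properties in a single pass (functional properties, max-cardinalities, the textual extension and both ranking methods are omitted), and the names `build_kernels`, `SAME_AS` and the `ex:` URIs are hypothetical.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"  # assumed prefixed form of the same-as predicate


def build_kernels(triples, inverse_functional):
    """Group subject URIs into sets of equivalent URIs (kernels).

    Simplification: one pass over the triples, no fixpoint iteration.
    """
    parent = {}  # union-find forest over URIs

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_subject = {}  # (property, value) -> first subject seen with it
    for s, p, o in triples:
        find(s)  # register the subject
        if p == SAME_AS:
            union(s, o)
        elif p in inverse_functional:
            # Two subjects sharing a value of an inverse functional
            # property are taken to denote the same object.
            key = (p, o)
            if key in first_subject:
                union(first_subject[key], s)
            else:
                first_subject[key] = s

    groups = defaultdict(set)
    for uri in list(parent):
        groups[find(uri)].add(uri)
    return [g for g in groups.values() if len(g) > 1]


# Hypothetical data: ex:a and ex:b are linked by same-as,
# ex:b and ex:c share a mailbox (foaf:mbox is inverse functional).
triples = [
    ("ex:a", "owl:sameAs", "ex:b"),
    ("ex:c", "foaf:mbox", "mailto:x@example.org"),
    ("ex:b", "foaf:mbox", "mailto:x@example.org"),
    ("ex:d", "rdfs:label", "D"),
]
kernels = build_kernels(triples, {"foaf:mbox"})
```

Here the three URIs `ex:a`, `ex:b` and `ex:c` end up in one kernel, while `ex:d` has no equivalent and is left out. A union-find structure is used because equivalence is transitive, so merging sets is cheaper than recomputing closures.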
