Data Integration

Large-Scale Data Integration

Many Semantic Web applications require the integration of data from distributed and autonomous RDF data sources. However, the values in RDF triples are often recorded as bare literals, and contextual information such as unit and format is frequently omitted on the assumption that every consumer shares the same understanding of the context. In the wider context of the Web, this assumption is generally not safe. The Context Interchange strategy provides a systematic approach to mediated data access, in which semantic conflicts among heterogeneous data sources are automatically detected and reconciled by a context mediator.
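The idea of context mediation can be illustrated with a minimal Python sketch. It is not the framework's implementation: the `Context` class, the `mediate` function, the `ex:` identifiers, and the exchange rate are all hypothetical, chosen only to show how a literal recorded under one context (currency and scale factor) could be converted into the context a receiver expects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical context description: how a numeric literal should be read."""
    currency: str
    scale: int  # multiplier applied to the stored literal (e.g. 1000 = thousands)


# Illustrative conversion rates to USD (assumed values, not real data).
RATES_TO_USD = {"USD": 1.0, "EUR": 1.25}


def mediate(value: float, source: Context, target: Context) -> float:
    """Convert a literal from the source context into the target context."""
    # Normalise to a common base (unscaled USD), then express in the target context.
    base = value * source.scale * RATES_TO_USD[source.currency]
    return base / RATES_TO_USD[target.currency] / target.scale


# Two sources record revenue under different, implicit contexts.
source_a = Context(currency="USD", scale=1)
source_b = Context(currency="EUR", scale=1000)
receiver = Context(currency="USD", scale=1)

# Triples as (subject, predicate, literal) plus the context of their source.
triples = [
    ("ex:companyA", "ex:revenue", 2_500_000.0, source_a),
    ("ex:companyB", "ex:revenue", 1_600.0, source_b),
]

# The mediator rewrites every literal into the receiver's context.
mediated = [(s, p, mediate(v, ctx, receiver)) for s, p, v, ctx in triples]

for subject, predicate, value in mediated:
    print(subject, predicate, value)
```

Without mediation, a naive comparison of the two literals would rank companyB far below companyA simply because its source stores values in thousands of euros. After mediation both values are expressed in the receiver's context and can be compared directly.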

A demo and manuals (Integration.rar) can be downloaded from the online disk.


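  • Provides a framework that handles schema mapping and the resolution of data-context semantic conflicts at the same time.

  • The framework is suited to large-scale data integration tasks and applications.

  • Uses the RDF data model to integrate heterogeneous data from multiple sources (studies indicate that data integration efficiency improves by more than 75% compared with the relational data model).

  • Supports dynamically adding and updating data sources.

  • Can also be used for unified data cleaning and transformation.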


References

  • Xiaoqing Zheng, Stuart E. Madnick, Xitong Li. SPARQL Query Mediation over RDF Data Sources with Disparate Contexts. Linked Data on the Web Workshop, Int. Conference on World Wide Web (WWW/LDOW’12), 2012.

  • Xiaoqing Zheng, Xitong Li, Stuart E. Madnick. SPARQL query mediation for data integration. 21st Workshop on Information Technologies and Systems (WITS’11), in conjunction with Int. Conference on Information Systems (ICIS’11), 2011.