University of Surrey


Semi Automated Transformation to OWL Formatted Files as an Approach to Data Integration. A Feasibility Study Using Environmental, Disease Register and Primary Care Clinical Data

Liang, SF, Taweel, A, Miles, S, Kovalchuk, Y, Spiridou, A, Barratt, B, Hoang, U, Crichton, S, Delaney, BC and Wolfe, C (2015) Semi Automated Transformation to OWL Formatted Files as an Approach to Data Integration. A Feasibility Study Using Environmental, Disease Register and Primary Care Clinical Data. Methods of Information in Medicine, 54 (1). pp. 32-40.

Full text not available from this repository.

Abstract

Introduction: This article is part of the Focus Theme of Methods of Information in Medicine on “Managing Interoperability and Complexity in Health Systems”.

Background: Data heterogeneity is one of the critical problems in analysing, reusing, sharing or linking datasets. Metadata, whilst adding semantic description to data, adds a further layer of complexity because the metadata descriptors are themselves heterogeneous. This can be managed by using a pre-defined model to extract the metadata, but doing so can reduce the richness of the data extracted.

Objectives: To link the South London Stroke Register (SLSR), the London Air Pollution toolkit (LAP) and the Clinical Practice Research Datalink (CPRD) while transforming data into the Web Ontology Language (OWL) format.

Methods: We used a four-step transformation approach: prepare meta-descriptions, convert data, generate and update meta-classes, and generate OWL files. We validated the correctness of the transformed OWL files by issuing queries and assessing the results against the original source data.

Results: We transformed SLSR, LAP and CPRD into OWL format. The linked SLSR and CPRD OWL file contains 3644 male and 3551 female patients. The linked SLSR and LAP OWL file shows that for 17 of the 35 outward postcode areas there are no overlapping data to support further analysis between SLSR and LAP.

Conclusions: Our approach generated a set of OWL-formatted files that can be queried directly or easily converted into other, more suitable formats for further analysis; the transformation was faithful, with no loss or anomalies. Our results show that the proposed method provides a promising general approach to addressing data heterogeneity.
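The validation idea in the abstract (query the transformed data and compare the results against the source) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the rows, field names and the `http://example.org/demo#` namespace are invented for illustration, and plain (subject, predicate, object) tuples stand in for a real OWL file.

```python
from collections import Counter

# Hypothetical source rows; field names and values are invented for illustration.
SOURCE_ROWS = [
    {"patient_id": "p1", "sex": "male", "outward_postcode": "SE1"},
    {"patient_id": "p2", "sex": "female", "outward_postcode": "SE5"},
    {"patient_id": "p3", "sex": "female", "outward_postcode": "SE1"},
]

BASE = "http://example.org/demo#"  # placeholder namespace, an assumption


def rows_to_triples(rows):
    """Turn each row into (subject, predicate, object) triples, one individual per row."""
    triples = set()
    for row in rows:
        subject = BASE + row["patient_id"]
        triples.add((subject, "rdf:type", BASE + "Patient"))
        for field, value in row.items():
            if field != "patient_id":
                triples.add((subject, BASE + field, value))
    return triples


def query(triples, predicate):
    """Count the object values of every triple that uses the given predicate."""
    return Counter(obj for _, pred, obj in triples if pred == predicate)


def validate(rows, triples, field):
    """Check that a count query over the triples matches the same count on the source rows."""
    return query(triples, BASE + field) == Counter(row[field] for row in rows)


triples = rows_to_triples(SOURCE_ROWS)
sex_counts = query(triples, BASE + "sex")
is_faithful = validate(SOURCE_ROWS, triples, "sex")
```

Here `validate` plays the role of the query-based check: if the per-value counts recovered from the triples differ from those in the source rows, the transformation has lost or altered data for that field.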

Item Type: Article
Subjects : Medical Science
Authors :
Name            Email                   ORCID
Liang, SF       UNSPECIFIED             UNSPECIFIED
Taweel, A       UNSPECIFIED             UNSPECIFIED
Miles, S        UNSPECIFIED             UNSPECIFIED
Kovalchuk, Y    UNSPECIFIED             UNSPECIFIED
Spiridou, A     UNSPECIFIED             UNSPECIFIED
Barratt, B      UNSPECIFIED             UNSPECIFIED
Hoang, U        u.hoang@surrey.ac.uk    UNSPECIFIED
Crichton, S     UNSPECIFIED             UNSPECIFIED
Delaney, BC     UNSPECIFIED             UNSPECIFIED
Wolfe, C        UNSPECIFIED             UNSPECIFIED
Date : January 2015
Identification Number : https://doi.org/10.3414/ME13-02-0029
Copyright Disclaimer : © Schattauer 2015
Uncontrolled Keywords : Semantics, Informatics, Knowledge, Data linkage, OWL ontology
Depositing User : Symplectic Elements
Date Deposited : 17 May 2017 10:47
Last Modified : 18 May 2017 12:44
URI: http://epubs.surrey.ac.uk/id/eprint/829238


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800