Document Type
Presentation
Publication Date
8-1-2014
Abstract
The management of provenance metadata is a pressing issue for high-profile, complex science projects that need to trace the lineage of their data products in order to withstand scrutiny. To represent, capture, transfer, store and deliver provenance data from a project’s processes, specialized metadata, new IT system components, and human and automated procedures are necessary. The collection of metadata, components and procedures can be termed a provenance methodology and architecture. Through our involvement with several large Australian science projects ([4], [5], [6], [7], [11]), we have developed a methodology that provides: Use Case assessments of project clients’ requirements for provenance; team structures and project processes to facilitate provenance requirements; systems’ behaviour to capture provenance from automated processes; behavioural patterns for project staff to capture provenance from manual processes; and procedures for compiling, storing and using provenance records. Semantic web provenance ontologies have been created ([1], [2], [3]) that allow generic, abstracted provenance representation, and we have extended the PROV ontology (PROV-O) through our provenance data management ontology (PROMS-O) [8] in order to address provenance Use Cases required by our projects that PROV-O does not address. Drawing on our project experience, we have developed a provenance architecture that specifies: a single provenance representation format for all project processes; the use of persistent ID systems to alias other systems’ URIs; archival systems to store data and provide access to versions of their data via URIs; provenance management systems to store and provide access to provenance data; provenance exporters to capture and transmit provenance data from automated systems; provenance procedures to collect provenance data from human processes; and an overarching integration architecture.
In this paper, we briefly describe our work on each of the points above. Together, these provide a range of pointers for projects wanting to embark on provenance management.
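As an illustration only, the idea of a provenance exporter that reports one automated process run in a single representation format can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `export_activity`, the `http://example.org/...` URIs and the choice of Turtle serialization are all hypothetical. Only the PROV-O terms themselves (`prov:Activity`, `prov:Entity`, `prov:Agent`, `prov:used`, `prov:wasGeneratedBy`, `prov:wasAssociatedWith`, `prov:startedAtTime`, `prov:endedAtTime`) come from the W3C PROV ontology; PROMS-O extensions are not modelled here.

```python
from datetime import datetime, timezone

# Standard namespace declarations for PROV-O and XML Schema datatypes.
PREFIXES = [
    "@prefix prov: <http://www.w3.org/ns/prov#> .",
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
    "",
]


def export_activity(activity, agent, inputs, outputs, started, ended):
    """Return a Turtle document describing one process run using PROV-O terms.

    All arguments except the two datetimes are URIs given as strings.
    This helper is hypothetical and exists only for illustration.
    """
    triples = [
        f"<{agent}> a prov:Agent .",
        f"<{activity}> a prov:Activity .",
        f"<{activity}> prov:wasAssociatedWith <{agent}> .",
        f'<{activity}> prov:startedAtTime "{started.isoformat()}"^^xsd:dateTime .',
        f'<{activity}> prov:endedAtTime "{ended.isoformat()}"^^xsd:dateTime .',
    ]
    # Each input is an Entity that the Activity used.
    for uri in inputs:
        triples.append(f"<{uri}> a prov:Entity .")
        triples.append(f"<{activity}> prov:used <{uri}> .")
    # Each output is an Entity that was generated by the Activity.
    for uri in outputs:
        triples.append(f"<{uri}> a prov:Entity .")
        triples.append(f"<{uri}> prov:wasGeneratedBy <{activity}> .")
    return "\n".join(PREFIXES + triples) + "\n"


if __name__ == "__main__":
    # Assumed persistent-ID style URIs; real projects would mint their own.
    doc = export_activity(
        activity="http://example.org/id/activity/run-42",
        agent="http://example.org/id/agent/pipeline",
        inputs=["http://example.org/id/dataset/1"],
        outputs=["http://example.org/id/dataset/2"],
        started=datetime(2014, 8, 1, 9, 0, tzinfo=timezone.utc),
        ended=datetime(2014, 8, 1, 9, 5, tzinfo=timezone.utc),
    )
    print(doc)
```

In an architecture like the one outlined in the abstract, a document of this kind would presumably be transmitted to a provenance management system rather than printed, and the dataset URIs would resolve through a persistent ID system to specific versions held in an archive.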
Comments
Session R04, Data Management and Brokering