Augmented Metadata and Annotations
From Metadata-Registry
Outline--a harvest scenario
- We publish two metadata formats designed to support annotations and augmented metadata
- The key feature of these formats is two metadata elements, one (and only one) of which must be present, either of which may be repeated:
- xxxxUniqueIdRef -- which contains a reference to a metadata record in the MR and must match an existing record
- dcIdentifierRef -- which contains a reference to a URI that may or may not exist in the MR.
- Augmented metadata must reference an existing URI in the MR.
- This could also be expressed as <reference type=xxxxUniqueId> or <reference type=dcIdentifier>
- These are intended to be used to supply annotations and augmented metadata for harvest via OAI and perhaps a services interface.
- Annotation and augmentation suppliers wishing to supply metadata about a resource identified by a URI should first query or harvest the MR to get a list of metadata records that are about that URI.
- They create metadata about their annotation in the above format and serve it via OAI. This record may carry the actual annotation or it can simply contain a reference. In the case of metadata augmentation, each record served should be a self-contained, incomplete metadata record and should not reference another source of metadata.
- We harvest the records through a standard harvest -- all incoming records will have to be associated with a collection record
- The ingest process creates a unique mrec record for each incoming record
- References in the MR must always be mrec_ids, so in the case of dcIdentifierRef the ingest process retrieves all mrecs that reference each dcIdentifierRef.
- If a dcIdentifierRef references a URI that is not found, an mrec record is created for that URI and is queued for metadata generation by iVia (controversial)
- An entry is created in the link table for each mrec identified either directly or by reference. Each entry will contain the mrec_id of the annotation record, the mrec_id of the mrec being annotated, a reference type, a datestamp, and a source mrec_id
- Note that the link table will need an additional 'source' field that will, in the case of annotations and augmentations, contain the mrec_id of the annotation or augmentation metadata record that supplied the link.
- Note also that reference type and datestamp are denormalized values that can be determined by reference to the source mrec_id if necessary.
- Output of augmented metadata is the tough thing -- it needs to be served both as a component part of the metadata format being augmented and as a distinct format, both within and without the mudball.
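
The ingest steps in this outline can be sketched roughly as follows. This is only an illustration under stated assumptions, not the MR's actual schema or code: the Registry and Link classes, the field names, the reference type strings "uniqueId" and "dcIdentifier", and the augmentation flag are all hypothetical, and an in-memory structure stands in for the real database and for the iVia queue.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import count


@dataclass
class Link:
    """One row of the link table (field names are assumptions)."""
    annotation_mrec_id: int  # mrec of the annotation or augmentation record
    target_mrec_id: int      # mrec being annotated
    reference_type: str      # denormalized: "uniqueId" or "dcIdentifier"
    datestamp: datetime      # denormalized
    source_mrec_id: int      # mrec of the record that supplied the link


class Registry:
    """Toy in-memory stand-in for the MR."""

    def __init__(self):
        self._ids = count(1)
        self.mrecs = set()          # all known mrec_ids
        self.by_uri = {}            # URI -> mrec_ids of records about that URI
        self.links = []             # the link table
        self.generation_queue = []  # URIs awaiting metadata generation

    def new_mrec(self, uri=None):
        mrec_id = next(self._ids)
        self.mrecs.add(mrec_id)
        if uri is not None:
            self.by_uri.setdefault(uri, []).append(mrec_id)
        return mrec_id

    def ingest(self, unique_id_refs=(), dc_identifier_refs=(), augmentation=False):
        # One (and only one) kind of reference element must be present;
        # either kind may be repeated.
        if bool(unique_id_refs) == bool(dc_identifier_refs):
            raise ValueError("need exactly one kind of reference element")

        if unique_id_refs:
            # Direct references must match existing records.
            unknown = [r for r in unique_id_refs if r not in self.mrecs]
            if unknown:
                raise ValueError(f"unknown mrec_ids: {unknown}")
            ref_type, targets = "uniqueId", list(unique_id_refs)
        else:
            ref_type, targets = "dcIdentifier", []
            for uri in dc_identifier_refs:
                if uri not in self.by_uri:
                    if augmentation:
                        # Augmented metadata must reference an existing URI.
                        raise ValueError(f"URI not in the MR: {uri}")
                    # The (controversial) step: create a placeholder mrec
                    # and queue the URI for metadata generation.
                    self.new_mrec(uri)
                    self.generation_queue.append(uri)
                # Resolve the URI to every mrec that is about it.
                targets.extend(self.by_uri[uri])

        # Each incoming record gets its own unique mrec.
        source = self.new_mrec()
        now = datetime.now(timezone.utc)
        for target in targets:
            # For annotations and augmentations the annotating record is
            # also the source of the link, so both ids coincide here.
            self.links.append(Link(source, target, ref_type, now, source))
        return source
```

For example, if two existing records describe the same URI, ingesting one annotation that carries a dcIdentifierRef for that URI yields two link entries, one per matching mrec. Validation runs before the new mrec is created so that a rejected record leaves no trace in the toy registry; whether the real ingest should behave that way is an open design choice.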
[NOTE: Pull Augmentation Use Cases for this section]