Difference between revisions of "Conceptual Framework"

From Metadata-Registry
Revision as of 09:07, 25 October 2005

Metadata Management Conceptual Framework

[Note: This section will be revised to include an overall picture of an MMS: OAI as glue, the underlying repository interacting with other data stores, harvest-shred-recompilation of metadata taking advantage of other services, etc. Individual components will be moved to the components and functionality section. It should also state what kinds of problems this solves, both for aggregations and for libraries running digital services.]

Currently, central data stores associated with institutional repositories and other metadata aggregators face a number of challenges. Data is often stored in remote locations, in incompatible formats, and at varying levels of quality, and it must be aggregated, normalized, and integrated. Existing data is frequently incomplete and may not conform to existing standards.

The Metadata Management Services seek to solve a number of these problems by providing services that improve the quality of existing metadata, manage the aggregation of metadata from both original metadata sources and quality-improvement services, and redistribute the improved metadata in multiple formats.

The broad architectural framework involves the following functions:

  • Aggregating item-level metadata from multiple data sources, in multiple formats, using OAI as the data interchange glue, and storing it in a central repository that provides a base for other services. These data sources and formats need not be equivalent or even necessarily compatible.
  • Providing harvest services to manage the routine repeat harvesting of metadata via OAI, including server registration.
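To make the routine repeat harvesting concrete, here is a minimal sketch of how a harvest service might build OAI-PMH ListRecords requests. The argument names (verb, metadataPrefix, from, set, resumptionToken) come from the OAI-PMH protocol itself; the function name, its parameters, and the example.org endpoint are illustrative assumptions and not part of the MMS design.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None,
                     set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument: when it is
        # present, no argument other than the verb may accompany it.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            # Selective harvesting: only records changed since this date.
            params["from"] = from_date
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# A repeat harvest would pass the date of the previous successful harvest.
print(list_records_url("http://example.org/oai", from_date="2005-10-01"))
```

A repeat harvest only needs to remember, per registered server, the base URL and the date of the last successful harvest. The from argument then limits each run to records that have changed since that date.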

An underlying architecture and philosophy forms the basis of this framework. It is realized in the following services:

  1. A Provider/Management service provides a user interface and a data storage space for providers to register their OAI servers with the aggregator, supply and maintain data necessary to the HarvestService, and provide metadata describing their service. This service creates two XML document types for further processing:
    • HarvestTrigger documents that contain all information necessary to initiate an OAI harvest from a metadata provider.
    • CollectionRecord documents that contain descriptive metadata about a collection of metadata being provided by a metadata provider.
  2. A HarvestService processes incoming HarvestTrigger documents to initiate an OAI harvest and then, optionally, further processes the harvested documents. The HarvestService relies on WatchedFolders and an extensible ProcessingHarness. HarvestTrigger documents may arrive in the WatchedFolders from many sources by many methods, and each is processed as long as it contains a valid set of instructions. This service produces the following output for further processing:
    • HarvestMerge documents that contain information from the HarvestTrigger document that initiated the harvest and may also contain the results of combining all metadata documents created by the OAI harvest process.
    • Optionally, emailed server and metadata validation responses.
  3. A MetadataCrosswalk service processes incoming HarvestMerge documents to create the desired metadata format. This is most likely a qualified Dublin Core metadata format created and maintained for use by an assortment of services. By default, the desired QDC is created from the required oai_dc metadata format that must be provided by every OAI-PMH server, but it may also be crosswalked from any other "native" metadata format. The MetadataCrosswalk service:
    • validates the base utility of incoming metadata and either accepts or rejects it
    • utilizes "safe" XSLT transforms to correct minor imperfections where possible
    • utilizes either custom or default XSLT transforms provided by a MetadataQA service to create the desired QDC
    • produces as output a DbInsert document that is used by the MetadataIngest service of the Metadata Repository
  4. A MetadataQA service produces provider-specific crosswalks, in the form of XSLT scripts, that create high-quality QDC metadata from selected native metadata.
  5. A MetadataIngest service further validates and parses DbInsert documents and inserts the processed metadata into the Metadata Repository.
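The WatchedFolders idea in step 2 can be sketched as follows, purely as an illustration. The HarvestTrigger layout used here (a HarvestTrigger root element with baseUrl and metadataPrefix children) is an assumption, since the actual schema is not given above. The function names and the harvest callback are likewise hypothetical stand-ins for the ProcessingHarness.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical element names; the real HarvestTrigger schema is not shown here.
REQUIRED_FIELDS = ("baseUrl", "metadataPrefix")


def parse_trigger(xml_text):
    """Return the trigger's instructions as a dict, or raise ValueError."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise ValueError(f"not well-formed XML: {exc}") from exc
    if root.tag != "HarvestTrigger":
        raise ValueError(f"unexpected root element: {root.tag}")
    instructions = {}
    for name in REQUIRED_FIELDS:
        value = root.findtext(name)
        if value is None or not value.strip():
            raise ValueError(f"missing required element: {name}")
        instructions[name] = value.strip()
    return instructions


def process_watched_folder(folder, harvest):
    """Scan the folder once and pass each valid trigger to harvest().

    Returns two lists of file names: (accepted, rejected).
    """
    accepted, rejected = [], []
    for path in sorted(Path(folder).glob("*.xml")):
        try:
            instructions = parse_trigger(path.read_text(encoding="utf-8"))
        except ValueError:
            # Triggers without a valid set of instructions are skipped.
            rejected.append(path.name)
            continue
        harvest(instructions)
        accepted.append(path.name)
    return accepted, rejected
```

Because the folder scan only checks that a trigger carries a valid set of instructions, it does not matter how a trigger file arrived. The Provider/Management service, a scheduled job, or a manual copy could all drop files into the same folder.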

The Metadata Repository stores all provided native metadata formats and QDC in an object/relational layer. From this layer, internal processes create static XML documents for use in efficiently serving metadata in multiple formats via OAI.