Feb. 10, 2006

From Metadata-Registry

Agenda and Notes, 2/10/06

1. Discussed UIUC/IMLS project.

  • The problems with harvesting from UIUC: only Simple DC, no explicit connection between collections and items.
  • Diane will mess around with the data we have, and show Stuart and Diny how it looks and what we can do with it, including samples of CUL Metadata Services reports.
  • We should come up with questions for the UIUC crowd about what they'd be willing to do to help.
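One way to look at the collection/item problem in the harvested data is to check each record for anything that could tie it to a collection. The following is a minimal sketch, assuming the records arrive as OAI-PMH `oai_dc` XML. The sample record and the `collection_links` helper are hypothetical, for illustration only, and are not actual UIUC data.

```python
import xml.etree.ElementTree as ET

# Standard OAI-PMH 2.0 and Simple DC namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample record (assumption): no setSpec, no dc:relation.
SAMPLE = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:item-1</identifier>
    <datestamp>2006-02-01</datestamp>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Sample item</dc:title>
      <dc:creator>Unknown</dc:creator>
    </oai_dc:dc>
  </metadata>
</record>"""


def collection_links(record_xml):
    """Return values in a record that could point from an item to a collection."""
    record = ET.fromstring(record_xml)
    # setSpec lives in the OAI header; dc:relation lives in the metadata payload.
    set_specs = [e.text for e in record.findall("oai:header/oai:setSpec", NS)]
    relations = [
        e.text for e in record.findall("oai:metadata/oai_dc:dc/dc:relation", NS)
    ]
    return {"setSpec": set_specs, "relation": relations}


links = collection_links(SAMPLE)
print(links)
```

Running this kind of check over the whole harvest would give a rough count of records with no collection pointer at all, which could feed the list of questions for UIUC.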

GEM Metadata Management Needs

2. Discuss current state of GEM metadata feeds (see below)

  • GEM as hybrid: fairly tight federation of data providers (current consortium) and select open harvest (new)
  • Feed integration currently underway for major federation collections, using GEM XML 2.0 schema (alternative collection holder or GEMCat4 RDF/XML)
    • Federated data--minimal quality control and minimal initial augmentation
    • Harvested data--potentially high levels of quality issues and of initial augmentation
  • Current state of collection
    • Approximately 45,000 consortium member conformant GEM records
    • Approximately 5,000 non-consortium member OAI-harvested records
    • Two GEM schema versions--GEM 1.0 and GEM 2.0: difference is that 1.0 is pre-DC-refinements, so has lots of GEM refinements; 2.0 is much more conformant with DC, after DC instituted refinements
    • Variety of source encodings: OAI (minimal), GEM Syntax 0 (delimited ASCII file), GEM DB-XML (first internal use of XML), GEM XML 2.0 (conformant with DCMI XML guidelines), RDF/XML (2.0)
      • Current integration underway (GEM XML 2.0 Schema)
    • Largest proportion (over 50%) of records in top 10 collections--attack first
    • Largest share of records (26,000) coming through GEM harvest of separate (non-embedded) metadata records
      • While I am not yet certain of this, I think GEM actually controls the vast majority of these records (i.e., we harvest from ourselves :-))
    • Only about 3,500 records harvested from resource-embedded metadata (HTML header)
    • Approximately

3. Discuss MMS integration (per Diane/Jon emails)