DateStamps

Datestamps

Protocol Definition

The purpose of datestamps in the OAI-PMH is to support incremental or selective harvesting. Each metadata record in a repository must have a datestamp. If more than one record is available from an item (perhaps a Dublin Core record and a MARC21 record) then the datestamps may change independently. The datestamp appears in the <header> of an OAI record.

Datestamps in the OAI-PMH must follow one of two specific formats. These are the "Complete date" and "Complete date plus hours, minutes and seconds" formats of ISO 8601:

  • day granularity: YYYY-MM-DD
  • seconds granularity: YYYY-MM-DDThh:mm:ssZ

where YYYY is the four-digit year, MM is the month (01 for January, etc.), DD is the day, hh is the hour, mm is the minute and ss is the second. The literal "T" separates the date from the time.
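The two formats can be illustrated with a minimal Python sketch. The function names `format_datestamp` and `parse_datestamp` are hypothetical and not part of the protocol, and the sketch assumes the caller passes timezone-aware datetimes.

```python
from datetime import datetime, timezone

# strftime/strptime patterns matching the two OAI-PMH datestamp formats.
DAY_FORMAT = "%Y-%m-%d"
SECONDS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def format_datestamp(moment: datetime, granularity: str = "seconds") -> str:
    """Render a timezone-aware datetime as a datestamp expressed in UTC."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = moment.astimezone(timezone.utc)
    if granularity == "day":
        return utc.strftime(DAY_FORMAT)
    return utc.strftime(SECONDS_FORMAT)


def parse_datestamp(text: str) -> datetime:
    """Parse a datestamp of either granularity into an aware UTC datetime."""
    fmt = SECONDS_FORMAT if "T" in text else DAY_FORMAT
    return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)


moment = datetime(2005, 8, 11, 14, 30, 5, tzinfo=timezone.utc)
print(format_datestamp(moment, "day"))
print(format_datestamp(moment, "seconds"))
```

Note that `strptime` is lenient about zero padding, so a repository that needs strict validation of incoming arguments would need an additional check.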

The granularity must be used consistently for all records within a repository and must be declared in the <granularity> element of the Identify response.

Datestamps in day granularity do not have a timezone designation. The boundaries between days are considered to occur in UTC. Datestamps in seconds granularity must have the timezone designation "Z", indicating UTC. As of September 2004 the OAI-PMH schema enforces the "Z" notation for UTC datestamps with seconds granularity. Before then, responses with other timezone designations would be schema-valid even if not correct according to the protocol specification.
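As a rough illustration of the UTC requirement, the following Python sketch converts a local time to a "Z" datestamp and rejects seconds-granularity strings that carry any other timezone designation. The names `to_utc_datestamp` and `is_valid_seconds_datestamp` are hypothetical, and the pattern checks only the shape of the string, not whether the month, day or time values are in range.

```python
import re
from datetime import datetime, timedelta, timezone

# Shape of a seconds-granularity datestamp: the trailing "Z" is mandatory.
SECONDS_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")


def is_valid_seconds_datestamp(text: str) -> bool:
    """Return True only for strings shaped like YYYY-MM-DDThh:mm:ssZ."""
    return SECONDS_PATTERN.fullmatch(text) is not None


def to_utc_datestamp(local: datetime) -> str:
    """Convert an aware local datetime to a UTC datestamp ending in "Z"."""
    if local.tzinfo is None:
        raise ValueError("naive datetime: the timezone is unknown")
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# 23:30 on 2005-08-10 at UTC-5 falls on the next day in UTC.
local = datetime(2005, 8, 10, 23, 30, 0, tzinfo=timezone(timedelta(hours=-5)))
print(to_utc_datestamp(local))
print(is_valid_seconds_datestamp("2004-09-01T12:00:00+02:00"))
```

The example also shows why day boundaries matter: a change made late in the evening local time can belong to the following day once expressed in UTC.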

See the protocol description of selective harvesting and datestamps: http://www.openarchives.org/OAI/openarchivesprotocol.html#SelectiveHarvestingandDatestamps

See the protocol on UTC datetime: http://www.openarchives.org/OAI/openarchivesprotocol.html#Dates

Best Practices for Datestamps

Datestamps allow a service provider to keep an up-to-date copy of metadata from a repository by periodically harvesting only those records that have changed since a particular date and time. Such incremental harvesting addresses a scalability issue by providing an alternative to completely reharvesting metadata from a repository. Datestamps are not to be confused with the dates that may be included within metadata records, for example within the <dc:date> element of a Dublin Core record.
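From the harvester's side, an incremental harvest amounts to passing the datestamp of the previous harvest in the protocol's `from` argument. A minimal Python sketch, in which the helper name `list_records_url` and the base URL `http://example.org/oai` are assumptions made for illustration:

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str,
                     last_harvest: Optional[str] = None) -> str:
    """Build a ListRecords request; include `from` only for incremental harvests."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if last_harvest is not None:
        # Ask only for records whose datestamp is on or after the last harvest.
        params["from"] = last_harvest
    return base_url + "?" + urlencode(params)


# Full harvest, then an incremental harvest from a day-granularity datestamp.
print(list_records_url("http://example.org/oai", "oai_dc"))
print(list_records_url("http://example.org/oai", "oai_dc", "2005-08-11"))
```

The value given in `from` has to use a granularity the repository supports, and `urlencode` percent-encodes the colons of a seconds-granularity datestamp.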

It is a best practice for data providers to include accurate and updated datestamps in their OAI repository. Only changes to the underlying item that have no effect on the OAI record should go unrecorded. If a service provider is performing incremental harvests, updated, added, and deleted records will only be harvested if their datestamps accurately reflect the time the record was added, updated, or deleted in the OAI repository.

It is best practice for the datestamp to reflect the date and time at which a record change was actually made available from the OAI repository. If, for example, a particular institution edits the metadata items on 2005-08-10, and then makes available the OAI records on the next day (2005-08-11), the datestamp should correspond to 2005-08-11 rather than 2005-08-10. This allows a service provider to accurately harvest changed records. Datestamps must never be 'backdated' because that might result in the change being missed by an incremental harvest.
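One way to follow this practice is to assign the datestamp at the moment a record is published through the OAI interface rather than when the underlying item is edited. The sketch below is illustrative only: the `OaiRecord` class, the `publish` and `changed_since` helpers, and the identifier are hypothetical and do not describe any particular repository software.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class OaiRecord:
    identifier: str
    metadata: str
    datestamp: str = ""  # set on publication, not when the item is edited


def publish(record: OaiRecord, now: Optional[datetime] = None) -> OaiRecord:
    """Stamp the record with the time it becomes available, expressed in UTC."""
    moment = now or datetime.now(timezone.utc)
    record.datestamp = moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return record


def changed_since(records: List[OaiRecord], from_stamp: str) -> List[OaiRecord]:
    """Select records for an incremental harvest.

    Datestamps of the same granularity sort correctly as plain strings.
    """
    return [r for r in records if r.datestamp >= from_stamp]


# Edited on 2005-08-10, but made available on 2005-08-11.
record = OaiRecord("oai:example.org:1", "<dc/>")
publish(record, datetime(2005, 8, 11, 9, 0, 0, tzinfo=timezone.utc))
print(record.datestamp)
print(len(changed_since([record], "2005-08-10T12:00:00Z")))
```

Had the record been stamped 2005-08-10T08:00:00Z instead, a harvester resuming from 2005-08-10T12:00:00Z would have missed it.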

As noted above, repositories may record datestamps with either date, or date and time, precision. This is referred to as the datestamp "granularity". The granularity must be used consistently for all records within a repository and must be declared in the <granularity> element of the Identify response. To avoid problems when comparing datestamps from around the world, they must always be specified in UTC (Coordinated Universal Time), as described in the protocol's section on UTC datetime. It is best practice that repositories use seconds granularity where practical. This allows a service provider to harvest incrementally at the finest level of precision.
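A harvester can read the declared granularity from the Identify response and adapt its `from` argument to match. The following Python sketch assumes a heavily trimmed, made-up Identify response (`SAMPLE_IDENTIFY`); the helper names `declared_granularity` and `fit_to_granularity` are hypothetical.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Trimmed, illustrative Identify response; real responses carry more elements.
SAMPLE_IDENTIFY = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <granularity>YYYY-MM-DD</granularity>
  </Identify>
</OAI-PMH>
"""


def declared_granularity(identify_xml: str) -> str:
    """Return the text of the <granularity> element of an Identify response."""
    root = ET.fromstring(identify_xml)
    element = root.find(".//oai:granularity", OAI_NS)
    if element is None or not element.text:
        raise ValueError("Identify response has no <granularity> element")
    return element.text.strip()


def fit_to_granularity(datestamp: str, granularity: str) -> str:
    """Adjust a datestamp to the granularity the repository declares."""
    if granularity == "YYYY-MM-DD":
        return datestamp[:10]  # drop the time part, if any
    if granularity == "YYYY-MM-DDThh:mm:ssZ":
        return datestamp if "T" in datestamp else datestamp + "T00:00:00Z"
    raise ValueError("unknown granularity: " + granularity)


granularity = declared_granularity(SAMPLE_IDENTIFY)
print(fit_to_granularity("2005-08-11T04:30:00Z", granularity))
```

Truncating to the start of the day may re-harvest some records already seen, but it does not skip any, which is the safer direction for an incremental harvest.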
