RepositoryLifecycle
Managing the Repository Lifecycle
Managing the lifecycle of an OAI repository is an important part of a data provider's responsibilities. The most visible part of lifecycle maintenance is registering the data provider with the Open Archives Initiative, but data providers also need to pay attention to other potential changes in the repository (for example, a change to all OAI identifiers in a repository) and to the impact of these changes on service providers.
Note: The terms 'repository' and 'data provider' are often used interchangeably, although technically the repository is the actual server that processes the OAI verbs and the data provider is the entity responsible for the repository.
Repository Conformance and Registration
It is a best practice to register an OAI repository with the official OAI registry of data providers. The registry can be browsed at http://www.openarchives.org/Register/BrowseSites and repositories can be registered at http://www.openarchives.org/data/registerasprovider.html
The primary benefits of registering a data provider in the official OAI registry are:
- conformance testing to ensure that the data provider meets the OAI specifications;
- periodic retesting of the repository for conformance once the initial test has been passed;
- publicity for the availability of the repository for harvest; and
- availability for inclusion in other OAI repository registries, including the OAI Registry at UIUC, which is the most comprehensive list of OAI data providers currently available. The UIUC registry picks up new data providers from the official OAI site monthly.
Conformance testing is particularly important. If an OAI repository does not pass the conformance testing, service providers are likely to have difficulty harvesting its metadata. Note that the conformance testing offered through the official OAI registry is limited to the required parts of the protocol and does not check the best practices published here.
It is possible to go through the conformance testing without registering the OAI repository. This is a good option for data providers who wish to test their implementations before registering them, or for data providers who are not interested in widely advertising their OAI repository.
Repository Maintenance
An OAI repository requires a certain level of maintenance to guarantee that the information it provides is current. Information included in the Identify response, or in any of the descriptive containers at the repository or set level, should be kept up to date. It is particularly important that the Identify response include the email address of the current contact person.
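As a rough illustration of how a data provider might check this periodically, the sketch below parses an Identify response and pulls out the fields most likely to go stale. The sample response, the example.org addresses, and the function name identify_summary are illustrative assumptions, not part of any official tool; only the element names and the namespace come from the OAI-PMH 2.0 schema.

```python
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical Identify response used only for illustration.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2024-01-01T00:00:00Z</responseDate>
  <request verb="Identify">http://repository.example.org/oai</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>http://repository.example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <adminEmail>admin@example.org</adminEmail>
    <earliestDatestamp>2001-01-01</earliestDatestamp>
    <deletedRecord>persistent</deletedRecord>
    <granularity>YYYY-MM-DD</granularity>
  </Identify>
</OAI-PMH>"""


def identify_summary(xml_text):
    """Return the Identify fields that most often need manual upkeep."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    if identify is None:
        raise ValueError("response contains no Identify element")
    return {
        "repositoryName": identify.findtext(
            "oai:repositoryName", default="", namespaces=OAI_NS
        ),
        "baseURL": identify.findtext("oai:baseURL", default="", namespaces=OAI_NS),
        # adminEmail is repeatable, so collect every occurrence.
        "adminEmail": [
            e.text.strip()
            for e in identify.findall("oai:adminEmail", OAI_NS)
            if e.text
        ],
    }


summary = identify_summary(SAMPLE_IDENTIFY)
```

In practice the XML would be fetched from the repository's own baseURL with verb=Identify, and the returned addresses compared against the current list of repository contacts.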
Data providers should make an effort to know who is regularly harvesting their OAI repository. This can be done by checking server logs. Having this information eases communication, particularly when there are substantial changes to, or downtime for, an OAI repository.
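One possible way to get this information out of the logs is sketched below. It assumes the web server writes the common "combined" access log format and that the repository is served under a path such as /oai; both are assumptions about the local setup, and the function name count_harvesters is hypothetical.

```python
import re
from collections import Counter

# Combined log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_harvesters(lines, oai_path="/oai"):
    """Count OAI requests per (client host, user agent) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # not in the assumed log format
        parts = match.group("request").split()
        if len(parts) < 2:
            continue  # malformed request line
        target = parts[1]
        # Only GET-style requests carry the verb in the logged URL.
        if target.startswith(oai_path) and "verb=" in target:
            counts[(match.group("host"), match.group("agent"))] += 1
    return counts


# Hypothetical log excerpt used only for illustration.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2024:00:00:01 +0000] '
    '"GET /oai?verb=ListRecords&metadataPrefix=oai_dc HTTP/1.1" '
    '200 5120 "-" "ExampleHarvester/1.0"',
    '192.0.2.10 - - [01/Jan/2024:00:00:09 +0000] '
    '"GET /oai?verb=ListRecords&resumptionToken=abc HTTP/1.1" '
    '200 4096 "-" "ExampleHarvester/1.0"',
    '198.51.100.7 - - [01/Jan/2024:00:01:00 +0000] '
    '"GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

harvesters = count_harvesters(SAMPLE_LOG)
```

Harvesters that send their requests by HTTP POST will not show the verb in the logged URL, so a count like this is a lower bound. Many harvesters also supply a contact address in the From request header, which is worth logging if the server can be configured to do so.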
If an OAI repository is going to be unavailable for a period of time, this should be communicated to the service providers that regularly harvest it. A human-readable page should also be published stating how long the repository is likely to be down and giving contact information for the repository administrator. The repository URL should not lead to a 404 error. The fact that the repository will be, or is, offline should be stated in the repository's Identify response, in appropriate error messages, and/or on a human-readable page.
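A minimal sketch of such a placeholder follows, written as a plain WSGI application. It answers every request with HTTP 503 and a Retry-After header, which the OAI-PMH flow-control guidance lists as a status harvesters should understand, together with a short human-readable page. The outage length, the wording of the page, and the contact address are placeholders to be replaced.

```python
# Assumed length of the outage, in seconds; adjust to the real schedule.
RETRY_AFTER_SECONDS = 3600

# Placeholder page; the contact address is hypothetical.
MAINTENANCE_PAGE = """<html>
  <head><title>Repository temporarily unavailable</title></head>
  <body>
    <p>This OAI repository is offline for maintenance and is expected
       to return within one hour.</p>
    <p>Contact: admin@example.org</p>
  </body>
</html>
"""


def maintenance_app(environ, start_response):
    """WSGI app that reports planned downtime instead of a 404."""
    body = MAINTENANCE_PAGE.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # Tells well-behaved harvesters when to try again.
        ("Retry-After", str(RETRY_AFTER_SECONDS)),
    ]
    start_response("503 Service Unavailable", headers)
    return [body]
```

For a quick local check, the application can be passed to wsgiref.simple_server.make_server from the Python standard library; in production the same response would more likely be configured directly in the web server.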
If the OAI repository's location (baseURL) changes, a redirection mechanism, an appropriate harvesting error message, and/or a human-readable page can help ensure that service providers are informed. Service providers who regularly harvest the OAI repository should be informed directly. The data provider should also re-register the repository at the OAI site and ensure that all other registries are aware of the updated baseURL.
If there are other major changes to the OAI repository (e.g. reorganization of sets, a change in the OAI identifiers used) that will affect how records are harvested (particularly in incremental harvests), these changes should be communicated to the service providers that regularly harvest the repository.
It can also be useful for data providers to communicate other information about their repository, such as its size, how often it changes in terms of number of records and/or sets (see also Sets documentation), and information about rights over the metadata. This information should be communicated to service providers that regularly harvest the repository, and it can also be included in the <description> containers for both the repository and the set. Generally, any information that might have an impact on service provider routines should be communicated, either through direct contact or through repository / set descriptions.
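The redirection option could look roughly like the following WSGI sketch, left running at the old address. The new baseURL shown is hypothetical, and the choice of a permanent (301) redirect is an assumption. Not every harvester follows redirects, so this supplements direct notification rather than replacing it.

```python
# Hypothetical new location of the repository.
NEW_BASE_URL = "http://new-repository.example.org/oai"


def redirect_location(query_string, new_base_url=NEW_BASE_URL):
    """Build the redirect target, keeping the original OAI arguments."""
    if query_string:
        return f"{new_base_url}?{query_string}"
    return new_base_url


def moved_app(environ, start_response):
    """WSGI app for the old baseURL that points clients to the new one."""
    location = redirect_location(environ.get("QUERY_STRING", ""))
    # Short plain-text note for people who open the old URL in a browser.
    body = f"This OAI repository has moved to {NEW_BASE_URL}\n".encode("utf-8")
    headers = [
        ("Location", location),
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("301 Moved Permanently", headers)
    return [body]
```

Preserving the query string means a harvester that does follow the redirect repeats the same request, including any resumptionToken, against the new location.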
Repository End of Life
A repository's end of life might be caused by two different situations:
- all of the resources described by the metadata no longer exist or are no longer available
- the data provider cannot or is not willing to maintain the OAI repository
In the first case (resources described are no longer available or no longer exist), a data provider should alert service providers that the repository will be taken down, and that service providers should purge metadata included in their aggregations. This can be done through direct contact with service providers and announcements on OAI listservs (such as oai-general and oai-implementers). If the OAI repository supports deleted records, all records should be marked deleted. The repository URL should not lead to a 404 error, but should indicate that the repository has been taken down.
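For repositories that do support deleted records, a deleted record is exposed as a header carrying status="deleted" and no metadata. The sketch below builds such a header; the function name and identifier are hypothetical, the OAI-PMH namespace is omitted for brevity, and the datestamp format assumes second-level granularity (a repository declaring day granularity would emit YYYY-MM-DD instead).

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def deleted_header(identifier, deleted_at=None):
    """Build the <header> element for a record marked as deleted."""
    if deleted_at is None:
        deleted_at = datetime.now(timezone.utc)
    header = ET.Element("header", {"status": "deleted"})
    ET.SubElement(header, "identifier").text = identifier
    # The datestamp is the time of deletion, so that incremental
    # harvests using a 'from' argument pick up the change.
    ET.SubElement(header, "datestamp").text = deleted_at.strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )
    return header


# Hypothetical identifier and deletion time used only for illustration.
example = ET.tostring(
    deleted_header(
        "oai:example.org:item-1",
        datetime(2024, 1, 1, tzinfo=timezone.utc),
    ),
    encoding="unicode",
)
```

Setting the datestamp of every record to the moment of deletion gives incremental harvesters a chance to receive the deletions before the repository is finally taken down.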
In the second case, if the organization maintaining the OAI repository is no longer able or willing to do so, the data provider should again alert service providers that the repository will be taken down, and that service providers should purge metadata included in their aggregations. However, data providers might also investigate alternative options, particularly if the metadata items exposed are relatively static. If the resources described will still be available and will not change, an alternate institution could volunteer to maintain the technical infrastructure of the repository. This could be done through a Static Repository Gateway or a Celestial OAI cache, for example. If an alternate institution is willing to accept responsibility for the technical maintenance of the repository, the data provider should communicate that information as widely as possible.
