Software Solutions and Packages

There are numerous ways to add OAI functionality to existing systems. More vendors are providing Open Source versions of their tools, giving many institutions the option of adding an OAI module to their current system. This section focuses mostly on non-commercial solutions for becoming an OAI data provider. It also covers tools for service providers, such as harvesting software.

When selecting a tool, consider not only whether it meets minimum OAI requirements (simple or unqualified Dublin Core format) but also whether it includes other features for increasing the quality of shareable metadata. For instance, can the tool support multiple metadata formats? Does it support harvesting via sets (as opposed to harvesting the full repository)? Does it allow harvesting to be performed in batches using resumptionTokens? Is it possible to indicate to harvesters when a record has been deleted? The nature of the metadata to be shared (how many records, how often updated or added) also impacts decisions about which OAI data provider solution to choose. For more information about being a data provider, see Best Practices for OAI Data Provider Implementations.
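The questions above can be checked against a running data provider with a few OAI-PMH requests. The following Python sketch is illustrative only: the base URL http://example.org/oai is a placeholder, the function names are hypothetical, and the parsing assumes a well-formed OAI-PMH 2.0 response. It builds request URLs and reads the deleted-record policy, datestamp granularity, and supported metadata formats from Identify and ListMetadataFormats responses.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by all OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL for a verb and its arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


def capabilities_from_identify(xml_text):
    """Read the deleted-record policy and granularity from an Identify response."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    return {
        # One of "no", "transient", or "persistent".
        "deletedRecord": identify.findtext("oai:deletedRecord", namespaces=OAI_NS),
        # "YYYY-MM-DD" or "YYYY-MM-DDThh:mm:ssZ".
        "granularity": identify.findtext("oai:granularity", namespaces=OAI_NS),
    }


def metadata_prefixes(xml_text):
    """List the metadataPrefix values from a ListMetadataFormats response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iterfind(".//oai:metadataPrefix", OAI_NS)]


# Placeholder endpoint: replace with the base URL of the repository under review.
BASE_URL = "http://example.org/oai"
identify_url = request_url(BASE_URL, "Identify")
formats_url = request_url(BASE_URL, "ListMetadataFormats")
sets_url = request_url(BASE_URL, "ListSets")
```

Fetching identify_url and formats_url (for example with urllib.request.urlopen) and passing the response text to the two parsing functions answers the questions about deleted records and metadata formats. A ListSets response that contains an error element with the code noSetHierarchy indicates that the repository does not support sets.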

Turnkey Solutions

  • ADLIB Archive
    • Presentation: ADLIB Archive is a system for managing collections in archives and records offices.
    • Features: Support for the OAI-PMH is included.
  • ArchivalWare (PTFS)
    • Presentation: ArchivalWare is a web-based, full-text search and retrieval system. It allows organizations to store, access, and manage digital archive collections within one system.
    • Features: ArchivalWare supports Dublin Core. It supports harvesting via sets and allows harvesting to be performed in batches using resumptionTokens.
  • CONTENTdm
    • Presentation: A content management system well suited to displaying images and multimedia content. CONTENTdm allows pricing according to the collection size, making it affordable for small collections.
    • Features (version 4.1): Handles Simple Dublin Core (oai_dc) and Qualified Dublin Core (qdc) formats. It handles deleted records, resumptionTokens, and sets. It can be configured to only allow harvesting of specific collections. The system allows compound objects but does not provide the ability to define datestamp granularity level.
  • Curator (formerly Encompass)
    • Presentation: Created by Endeavor Information Systems, Curator is a digital library system for academic and research libraries. Features include customizable metadata formats and support for SRU/SRW.
    • Features: The current version may be set up as a data provider, and records harvested via OAI may also be pulled into the system.
  • CWIS
    • Presentation: Collection Workflow Integration System (CWIS) is a turnkey open source software package developed by the Internet Scout Project as part of the NSDL initiative. Available for free, it includes a search engine, a recommender system, OAI and RSS servers, and more. Well adapted to working with smaller collections.
    • Features (version 1.3.1): Supports OAI sets, resumptionTokens, oai_dc and nsdl_dc, and the OAI-SQ extension for searching via OAI-PMH. Allows simultaneous querying and updating of records.
  • Digital Commons
    • Presentation: A commercial application from bepress, marketed by UMI.
    • Features: Offers a simple web-based uploading system for MS Word, RTF, or PDF documents. PDF versions are automatically created from MS Word or RTF files, including a cover page with metadata. Can be used both as an institutional repository system and as a full-featured journal and book publishing tool (see the eScholarship repository as an example).
  • DigiTool
    • Presentation: A suite of software tools for managing digital resources, developed by Ex Libris and especially adapted to library systems. It is coupled with MetaLib, which allows harvesting of OAI records. It also allows sharing using Z39.50.
    • Features: Handles multiple metadata formats. Handles resumptionTokens (up to 1000 records) and sets but not deleted records. Currently has a bug that prevents it from harvesting sets larger than 1000 records.
  • DLXS
    • Presentation: The Digital Library eXtension Service (DLXS) was created by the University of Michigan. The OAI feature can be used in conjunction with the collection manager portion of DLXS (collmgr) and the xpat search engine. It runs using a MySQL or CSV database and is written in Perl.
    • Features (version 13a): Handles sets, resumptionTokens, collection descriptions, and DLXS BibClass to oai_dc mapping.
    • Features (DLXS Broker20): The data provider component for DLXS systems. Allows set harvesting and resumptionTokens and multiple metadata formats, but does not yet support deleted records. Documentation lives at http://dlxs.org/docs/12a/collmeta/broker.html.
    • Features (UMHarvester/UMProvider): (see under Packages)
  • DSpace
    • Presentation: Originally developed by Hewlett Packard and MIT but currently managed as an open source development project coordinated by the DSpace Federation, DSpace is a suite of tools for creating online repositories. It includes the OCLC-developed data-provider tool OAICat. Runs on Linux, UNIX, and Windows systems. Software is downloadable at SourceForge. It is intended to hold born-digital assets.
    • Features (version 1.2): DSpace handles OAI sets, resumptionTokens and deleted records.
  • EPrints
    • Presentation: EPrints, the oldest OAI repository software, is an open source solution developed at the University of Southampton. It is primarily adapted to digital documents, such as pre- and post-prints, and grey literature for e-print repositories (self-archiving and open access).
    • Features (version 2.3.0): Supports versions 1.1 and 2.0 of the OAI protocol. The default configuration only handles the OAI-DC format, but it can be customized to support other formats. See the OAI FAQ - EPrints for issues associated with specific versions of EPrints.
  • Fedora
    • Presentation: Open-source software developed by the University of Virginia and Cornell University for creating online repositories.
    • Features (version 2.1.1): Can disseminate any metadata format supported by the underlying Fedora repository. Uses the Proai data provider toolkit.
  • Greenstone
    • Presentation: Open-source, freely available software, cooperatively created by UNESCO and the New Zealand Digital Library Project, University of Waikato.
    • Features: Includes a graphical interface for editing metadata. The embedded relational database works well within Greenstone and, with additional programming, can be incorporated into other systems. Interoperable with DSpace. Greenstone can act as both an OAI data provider and a service provider, harvesting from other data providers in order to create new Greenstone collections. For more details refer to the Greenstone OAI Support page.
  • Insight®
    • Presentation: Insight empowers users to build digital collections of any size and manage, access, use and present those collections over a network or the Internet. Tools include working with images and multimedia, including zooming on image details, conducting side-by-side comparisons, annotating images and more. Complete cataloguing data accompanies every image, allowing for simple to more in depth searches across one or multiple collections, no matter where they reside.
    • Features: Share collections and access others. Centralize and secure visual resources. Manage and catalog visual collections.
  • Keystone
    • Presentation: The Keystone Digital Library Suite is a family of open source digital content management, portal management and information discovery software packaged together to provide libraries, museums and archives with state-of-the-art digital library services.
    • Features: Keystone can act as both an OAI data provider and an OAI service provider. As a data provider it can support multiple metadata formats depending on the availability of XSLT mapping files. For details refer to Keystone and OAI Metadata Services.
  • Metadata Migrator Tool
    • Presentation: Web-based program for migrating local files (formatted in .csv, .tab, or .dbf) into simple Dublin Core XML-formatted files. Available for free from the Emory University MetaScholar Initiative. Register to receive login and password.
    • Features (version 1.0): Allows harvesting using resumptionTokens. Does not yet support multiple metadata formats, deleted records, or automatic updating. Does not yet allow set harvesting.
  • Simple Digital Library
    • Presentation: A suite of software tools for managing and presenting digital resources, developed by Roaring Development. SimpleDL was developed for libraries and museums. It allows pricing according to the collection size, making it affordable for small collections.
    • Features: Full OAI support, including simple Dublin Core (oai_dc) and Qualified Dublin Core (qdc) formats. It handles deleted records, resumptionTokens, and sets. SimpleDL can be configured to only allow harvesting of specific collections.
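Several entries above rely on simple Dublin Core (oai_dc) as the baseline format, and the Metadata Migrator Tool entry describes converting delimited local files into simple Dublin Core XML. The following Python sketch shows what such a conversion can look like. It is not the Metadata Migrator Tool's implementation: the column layout (CSV headers named after Dublin Core elements, with ";" separating repeated values) and the function names are assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)

# The fifteen elements of simple (unqualified) Dublin Core.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def row_to_oai_dc(row):
    """Turn one CSV row (a dict) into an oai_dc:dc XML string."""
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for column, value in row.items():
        # Columns that are not Dublin Core elements are skipped.
        if column in DC_ELEMENTS and value:
            # Assumed convention: ';' separates repeated values in a cell.
            for part in value.split(";"):
                part = part.strip()
                if part:
                    ET.SubElement(root, f"{{{DC}}}{column}").text = part
    return ET.tostring(root, encoding="unicode")


def csv_to_oai_dc(csv_text):
    """Convert CSV text with a header row into a list of oai_dc records."""
    return [row_to_oai_dc(row) for row in csv.DictReader(io.StringIO(csv_text))]


# Hypothetical sample data; the "shelf" column is ignored.
sample = "title,creator,shelf\nOld Map,Smith; Jones,A1\n"
for record in csv_to_oai_dc(sample):
    print(record)
```

Each cell value becomes one or more repeated elements, which reflects the fact that every element in simple Dublin Core is optional and repeatable. A production conversion would also need to add the xsi:schemaLocation attribute that the oai_dc schema expects.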

Packages

  • Australian National University Harvester Service
    • Presentation: The Harvester Service is an open source OAI-PMH harvesting application based on OCLC's OAIHarvester2. It is a deployable webapp running under Tomcat providing harvest scheduling and management services for use by external applications. It also supports deployment of custom harvest classes for tailored processing on request and response data.
    • Features: Harvest scheduling and management, custom harvester classes.
  • jOAI
    • Presentation: This Java-based software contains a data provider and a data harvester. It stores XML files.
    • Features: The jOAI data provider allows XML files from a file system to be exposed as items in an OAI data repository and made available for harvesting by others using the OAI-PMH. The jOAI harvester is used to retrieve metadata records from remote OAI data providers and save them to the local file system, one record per file.
  • OAIbiblio
    • Presentation: OAIbiblio is a PHP-based data provider implementation of the OAI-PMH, version 2.0. This toolkit can be easily customized to communicate with an already existing, multi-table MySQL database.
    • Features (0.6 beta): Support for complex MySQL data structures. Supports XSLT for metadata transformations. Support for sets. See the project web page for details.
  • OAICat
    • Presentation: Java servlet Web application created at OCLC. Open Source software repository framework that can be easily customized using Java interfaces. A popular data-provider tool on DSpace.
    • Features (version 1.5.30): Supports version 2.0 of the protocol. Several sample implementations for various data sources are available, such as data stored as files on the file system, a data provider that acts as a gateway to an SRU/W search system, or data retrieved from a database via JDBC.
  • OAIHarvester2
    • Presentation: The OAIHarvester2 Open Source Software (OSS) project is a Java application that provides an OAI-PMH harvester framework. It was developed at OCLC.
    • Features: Supports all features of the OAI 2.0 protocol. Although this is intended as a framework for developing custom OAI harvesters, it also includes a sample OAI harvester application.
  • Proai
    • Presentation: Proai is an OAI provider service written in Java, designed to be easily integrated with existing metadata repositories.
    • Features: Validates and caches records from a source repository for performance and reliability. Full support for sets and persistent deleted record status.
  • UIUC OAI Metadata Harvesting Project software
    • Presentation: The UIUC OAI Metadata Harvesting Project has created around 10 different software toolkits for creating OAI Data Providers & Harvesters. Most are implemented for the Microsoft Windows environment but a few are Java-based.
    • Features: The packages include various stand-alone toolkits for implementing various types of data providers, such as for data stored in relational databases or data stored as XML files. An implementation of an OAI Static Repository Gateway is also available. The toolkit also includes a full-featured OAI Harvester API implemented as a Windows ActiveX DLL along with a command-line harvesting tool. The suite of software also includes a thumbnail generation utility specifically tailored to generating thumbnails or thumbshots of resources associated with harvested OAI records.
  • University of Michigan OAI Toolkit
    • Presentation: The University of Michigan OAI Toolkit is a set of Perl based tools for harvesting OAI data and creating an OAI-PMH 2.0 compliant data provider.
    • Features (UMHarvester): An open-source harvester that provides simple and controlled harvesting of metadata and places harvested records into a browseable file system. It handles many vagaries of data provider repositories. The tool uses LWP for harvesting, has multiple re-try options, and includes a batch harvest tool (Batch_UMHarvest) that can automatically perform incremental harvesting.
    • Features (UMProvider): An open-source data provider that relies heavily on libxml (XML::LibXML) and will store the data in nearly any relational database. It functions by creating a simple object-oriented interface for accessing a database of stored metadata records and responding to OAI requests.
  • Virginia Tech OAI Software
    • Presentation: Virginia Tech has developed a suite of OAI software toolkits, including OAI data providers in several languages and for different architectures, such as metadata files stored on the file system (XMLFile) and data stored in relational databases, as well as OAI RSS transformation tools and an OAI harvester. Most toolkits are written in Perl, but there is also a C++ implementation.
    • Features: Varies by tool, but all are open source and may be customized. Older toolkits may not support version 2.0 of the protocol.
  • ZMARCO
    • Presentation: Developed by the University of Illinois at Urbana-Champaign as a way of providing OAI-PMH access to MARC records already accessible through Z39.50 gateways. Free, open source software, written in Visual Basic and VBScript, and easily modified.
    • Features: Is dependent on the features of the underlying Z39.50 target and may be difficult to configure on Z39.50 targets without the requisite features or indexes. The default configuration supports MARC XML and simple DC (oai_dc) metadata formats, but may be customized via additional XSLT transformations.
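Several of the harvesters listed above share the same basic behaviour: they issue ListRecords requests, follow resumptionTokens until the list is complete, support incremental harvesting through the from argument, and save one record per file. The following Python sketch illustrates that loop. It is not taken from any of the packages above; the function names, the file naming scheme, and the handling of deleted records (removing the local copy) are assumptions, and retries and OAI error responses are not handled.

```python
import re
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def fetch_url(url):
    """Default fetcher: return the response body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def harvest(base_url, metadata_prefix="oai_dc", from_date=None,
            set_spec=None, fetch=fetch_url):
    """Yield one dict per record, following resumptionTokens to the end."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # incremental harvest
    if set_spec:
        params["set"] = set_spec    # selective harvest by set
    while params:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        for record in root.iter(OAI + "record"):
            header = record.find(OAI + "header")
            yield {
                "identifier": header.findtext(OAI + "identifier"),
                "datestamp": header.findtext(OAI + "datestamp"),
                "deleted": header.get("status") == "deleted",
                "xml": ET.tostring(record, encoding="unicode"),
            }
        # A missing or empty resumptionToken means the list is complete.
        token = (root.findtext(f".//{OAI}resumptionToken") or "").strip()
        # A resumptionToken is an exclusive argument: only verb accompanies it.
        params = {"verb": "ListRecords", "resumptionToken": token} if token else None


def filename_for(identifier):
    """Map an OAI identifier to a safe file name (assumed naming scheme)."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", identifier) + ".xml"


def store(records, directory):
    """Write one record per file; remove local copies of deleted records."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for record in records:
        path = directory / filename_for(record["identifier"])
        if record["deleted"]:
            path.unlink(missing_ok=True)  # requires Python 3.8 or later
        else:
            path.write_text(record["xml"], encoding="utf-8")
```

A call such as store(harvest("http://example.org/oai", from_date="2006-01-01"), "harvested") would perform an incremental harvest against a placeholder endpoint. The fetch parameter exists so that the loop can be exercised with canned responses instead of a live repository.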